










AN INTRODUCTION TO 
MODERN STATISTICAL METHODS 






BY 


PAUL R. RIDER 


Washington University 
Saint Louis 


NEW YORK 

JOHN WILEY & SONS, Inc. 

London: CHAPMAN & HALL, Limited



Copyright, 1939, by
Paul R. Rider


All Rights Reserved
This book or any part thereof must not
be reproduced in any form without
the written permission of the publisher.


FIFTH PRINTING, JUNE, 1960


PRINTED IN U.S.A.


PREFACE 


In the past three decades enormous advances have been made
in the field of statistics. These advances have taken place so
rapidly that it is not at all surprising that those employing
statistical methods have found it difficult to keep pace with
them, or that certain of the older methods, which are obsolete,
and even in some cases erroneous (or, at the very best, crudely
approximative in character) continue to be taught in the classroom
and to be treated in textbooks which appear from time
to time. A notable example is the use of the probable error
or standard error in testing the significance of a correlation
coefficient derived from a sample, although this method gives
unreliable or incorrect results if there is a high degree of
correlation in the population from which the sample is drawn, or if
the number in the sample is small.

It is, of course, in the theory of small samples that the
greatest progress has been made. The theory of sampling which
assumes that the sample is composed of a large number of items
is inadequate for many practical purposes. Biological,
agricultural, and other scientific experiments frequently deal with
comparatively few observations. Sometimes the cost of obtaining
additional observations is prohibitive; sometimes, indeed, it
is impossible to obtain more data, as might be true in the case
of meteorological records. In manufacturing inspection, too,
small samples are of frequent occurrence.

Most of this theory has been developed and unified by R. A.
Fisher, who has shown how to make more accurate estimates
and how to utilize the maximum amount of information contained
in a set of data, and has provided exact tests of reliability
and significance. Fisher's efficient methods, at first slow
in taking hold, because not thoroughly understood, gradually
began to gain momentum, and are now spreading rapidly.

In this book I have endeavored to explain the most widely
used of these methods, illustrating their application by
comparatively simple numerical examples, so that the underlying
principles are not lost sight of in a maze of arithmetical
computations. The earlier chapters develop the fundamental
concepts of statistics, so that the book is suitable as a textbook for
a first course in the subject. It is also planned for those with
some knowledge of the subject who wish to gain an insight into
the more modern methods, as it leads from the classical concepts,
through such topics as "Student's" distribution and the
chi-square distribution, to the analysis of variance and the design
of experiments, which are the culminating features of Fisher's
work.

Grateful acknowledgment is made to Professor Fisher, and
to his publishers, Messrs. Oliver and Boyd, for their generous
permission to reproduce, from "Statistical Methods for Research
Workers," the tables of t, chi square, and the 5 per cent and
1 per cent points of the distribution of z.

I am deeply indebted to Dr. Churchill Eisenhart for a critical 
reading of the manuscript, and for many valuable suggestions. 
However, full responsibility for errors is, of course, my own. 

I am also indebted to various persons and various sources 
for material used in the exercises, being particularly grateful 
to Messrs. A. G. Brooks and J. B. Gibson for supplying, and for 
granting permission to use, certain data of the Western Electric 
Company. 

Paul R. Rider 

Washington University
Saint Louis 
September, 1938 



CONTENTS


CHAPTER

I. Frequency Distributions

    1. Frequency tables
    2. Cumulative frequency tables
    3. Continuous and discrete variables
    4. Graphic representation of frequency distributions
    5. Frequency curves. Theoretical frequency distributions

II. Averages and Moments

    6. Averages
    7. Arithmetic mean
    8. Weighted mean
    9. Median
    10. Mode
    11. Geometric mean
    12. Harmonic mean
    13. Appropriateness of different averages
    14. Partition values
    15. Variance and standard deviation
    16. Mean deviation
    17. Moments

III. Regression

    18. Regression or trend lines
    19. Transformations
    20. Multiple regression
    21. Curvilinear regression

IV. Correlation

    22. Coefficient of correlation
    23. Connection between correlation and regression
    24. Correlation table
    25. Correlation ratio
    26. Relation between correlation coefficient and correlation ratio
    27. Index of correlation
    28. Multiple correlation
    29. Partial correlation

V. The Binomial and Normal Distributions

    30. Binomial distribution
    31. Normal distribution
    32. Fitting a normal distribution to observed data
    33. Gram-Charlier type A distribution
    34. Testing the significance of a mean when the population
        standard deviation is known
    35. Fiducial or confidence limits
    36. Testing the significance of the difference between two
        means when the population standard deviation is known
    37. Testing the significance of the difference between two
        proportions
    38. Testing the significance of a correlation coefficient
    39. Testing the significance of the difference between two
        correlation coefficients

VI. Student's Distribution

    40. Student's distribution and the reliability of a mean
    41. Confidence limits for the population mean
    42. Testing the significance of the difference between two means
    43. Testing the significance of a regression coefficient
    44. Testing the significance of the difference between two
        regression coefficients
    45. Testing the significance of a partial regression coefficient
    46. Comparing two partial regression coefficients
    47. Testing whether a sample has been drawn from uncorrelated
        material
    48. Testing the significance of a partial correlation coefficient

VII. The Chi-Square Distribution

    49. Chi square
    50. Distribution of variances and standard deviations
    51. Testing the homogeneity of several estimated variances
    52. Small samples from binomial and Poisson distributions
    53. Combining homogeneous estimates of correlation
    54. Test of goodness of fit
    55. Application to contingency tables
    56. Contingency tables with small frequencies

VIII. Analysis of Variance

    57. Comparing two variances
    58. Analysis of variance as applied to linear regression
    59. Application to curvilinear and multiple regression and
        correlation
    60. Absolute criteria in the theory of regression
    61. Testing the significance of the correlation ratio
    62. Testing linearity of regression
    63. Variance within and among classes
    64. Subdivision of variance into more than two portions
    65. Analysis of covariance

IX. Experimental Design

    66. Randomized blocks
    67. Latin square
    68. Factorial design and orthogonality
    69. Confounding
    70. Partial confounding
    71. Dummy treatments
    72. Non-orthogonal data

Tables

Index







INTRODUCTION TO 
MODERN STATISTICAL METHODS 


CHAPTER I 


FREQUENCY DISTRIBUTIONS 


1. Frequency tables. A frequency table is a table classifying
a set of observations according to the numbers of them which fall
within certain limits. It is a tabular method of exhibiting a
frequency distribution. For example, Table 1 classifies the heights
of a group of men. The table shows the frequency with which men of
a given height, or rather between two given limits of height, occur
in the group of 346 men under consideration. The values 58 inches,
60 inches, 62 inches, etc., are the class limits, and the difference
between two consecutive class limits, here 2 inches, is the class
interval. The mid-values of the classes are obviously 59 inches, 61
inches, etc. The range of the table, from 58 inches to 74 inches,
is 16 inches.

TABLE 1

Frequency Table of Heights of a Group of Men

Height         Number of men within given
in inches      limits of height (frequency)

58-60                    1
60-62                    2
62-64                    9
64-66                   48
66-68                  131
68-70                  102
70-72                   40
72-74                   13

Total...               346

A word regarding classification and class limits may be worth while
at this point. It is assumed that in the construction of Table 1 the
measurements of height have been made to a sufficient degree of
fineness that no doubt exists regarding the class to which a man
belongs. If, however, we had a set of data in which measurements
were made to the nearest inch, we could not employ the same class
limits as those used in Table 1. For a height recorded as 62 inches
would simply mean that the measurement was between 61.5 and 62.5
inches. In such a case, if we wished to use a 2-inch interval, we
could set the classes as 57.5-59.5, 59.5-61.5, . . . The mid-values
of these classes would be 58.5, 60.5, . . .

Some question arises as to what disposition to make of an
observation or measurement which falls exactly on a class boundary.
For example, if the classes are as in Table 1 and we have a
measurement of 62 inches, it is not clear whether this should be
assigned to the class 60-62 or to the class 62-64. In such an
instance there are certain theoretical advantages in dividing the
unit of frequency between the two classes, and assigning 1/2 to
each class.

Difficulties in the classification of raw data can usually be 
avoided by a proper choice of class limits.* 
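
The rule of dividing a boundary observation between two classes can be sketched in Python. This is only an illustration under stated assumptions: the function name `tally` and the sample heights are hypothetical, the classes are taken to be of equal width, and every observation is assumed to lie within the range of the table.

```python
from fractions import Fraction

def tally(observations, lower, width, n_classes):
    """Count observations per class. A value falling exactly on an
    interior class boundary contributes 1/2 to each adjoining class."""
    freq = [Fraction(0)] * n_classes
    for x in observations:
        pos = Fraction(x - lower) / width   # position in class-interval units
        i = int(pos)                        # index of the class containing x
        if pos == i and 0 < i < n_classes:  # exactly on an interior boundary
            freq[i - 1] += Fraction(1, 2)
            freq[i] += Fraction(1, 2)
        else:
            freq[min(i, n_classes - 1)] += 1
    return freq

# Hypothetical heights in inches, classified with the limits of Table 1.
heights = [59.2, 61.0, 62, 63.4, 66, 66.1]
freq = tally(heights, lower=58, width=2, n_classes=8)
print([str(f) for f in freq])
```

The total of the class frequencies still equals the number of observations, since each boundary value is counted as two halves.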

2. Cumulative frequency tables. The above frequency distribution
is equally well specified if we know the number of men
below (or above) any given height, for if we know that there are
60 men below 66 inches in height and 12 below 64 inches, we can
find at once that there are 60 - 12, or 48, men who are between
64 and 66 inches tall. If we convert Table 1 into a table showing
the number of men below certain heights, we obtain the cumulative
frequency table, Table 1A.
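
As a brief illustration, the conversion from class frequencies to cumulative frequencies might be carried out as follows. The variable names are chosen here for the sketch; the figures are those of Table 1.

```python
from itertools import accumulate

# Upper class limits and class frequencies taken from Table 1.
upper_limits = [60, 62, 64, 66, 68, 70, 72, 74]
frequencies = [1, 2, 9, 48, 131, 102, 40, 13]

# Running totals give the number of men below each upper limit.
cumulative = list(accumulate(frequencies))
below = dict(zip(upper_limits, cumulative))

# A class frequency is recovered as a difference of cumulative frequencies.
between_64_and_66 = below[66] - below[64]
print(cumulative, between_64_and_66)
```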

3. Continuous and discrete variables. The variable in the 
foregoing example is height. Theoretically it can be measured to 
any degree of fineness. Such a variable is called continuous. 
There are, however, variables which can have only integral values. 
Such variables are called integral or discrete. Examples of such 
variables are the number of petals on flowers (see Table 2), the 
number of spots obtained in throwing dice, the number of 
heads obtained in tossing coins, the number of children in a 
family. 

* For a good discussion of these and related questions, see Yule and
Kendall, "An Introduction to the Theory of Statistics," Charles
Griffin & Co., Ltd., London, 1937.





TABLE 1A

Cumulative Frequency Table of Heights

Height in      Number of men below specified
inches         height (cumulative frequency)

60                       1
62                       3
64                      12
66                      60
68                     191
70                     293
72                     333
74                     346


TABLE 2

Frequency Table of Numbers of Petals on a Certain Species of Flower

Number         Number of flowers having a specified
of petals      number of petals (frequency)

 5                     133
 6                      55
 7                      23
 8                       7
 9                       2
10                       2

Total....              222


4. Graphic representation of frequency distributions. In the
case of continuous variables, the accepted method of representing
a frequency distribution graphically is by means of a rectangular
frequency diagram or histogram. This is constructed by marking
off a scale for the variable and erecting, at the appropriate
positions on this scale, rectangles whose areas are equal or
proportional to the respective class frequencies. (See Fig. 1.) In the
usual case of class intervals of equal size, such rectangles will
obviously have heights equal or proportional to the respective
class frequencies.

Fig. 1. Histogram of Heights of a Group of Men.

A cumulative frequency distribution can be represented by
plotting points with ordinates equal to the cumulated frequencies,
and with abscissas equal to the upper limits of the classes, and
then connecting these by straight-line segments. (See Fig. 3.)
For discrete variables, perhaps the best method of graphic
representation is to erect, at the proper places on the scale or base line,
ordinates equal or proportional to the frequencies. (See Fig. 2.)

Fig. 2. Frequency Diagram. Discrete Variable: Number of Petals on a
Species of Flower.

5. Frequency curves. Theoretical frequency distributions.
If the size of the class interval of a distribution be decreased
indefinitely and the number of individuals be simultaneously increased
indefinitely, the histogram approaches a frequency curve. A
frequency curve may be regarded as representing an idealized
frequency distribution.

Suppose that the equation of the frequency curve is $Y = f(X)$.
The function $f(X)$ can always be multiplied by a constant factor
which will make the area under the curve equal to unity, and we
shall assume that this has been done. Then we have

$$\int_{-\infty}^{\infty} f(X)\,dX = 1 \tag{1}$$

(If the curve does not actually extend to infinity, but meets the
X-axis in some point such as $X_0$ in Fig. 4, the value of $f(X)$ is
to be regarded as zero outside of such points. Consequently we
can always use $-\infty$ and $\infty$ as limits of the above definite
integral.) The proportion of items between the values $X = a$
and $X = b$, $a < b$, is the shaded area in Fig. 4, and, provided
(1) holds, is given analytically by

$$\int_{a}^{b} f(X)\,dX \tag{2}$$

By the artifice of defining $f(X) = 0$ outside the range of the curve,
we see that we have defined $f(X)$ so that (2) gives the proportion
of the area under the curve lying between any two values $X = a$
and $X = b$, $a < b$. This is the exact mathematical interpretation
of $f(X)$. As an aid to intuition, it is often convenient to regard
$f(X)\,dX$ as giving approximately the proportion of items in the
interval between $X$ and $X + dX$, the closeness of the approximation
depending on how small $dX$ is. It is desirable to write the
equation of a frequency curve in the form $Y\,dX = f(X)\,dX$, because
if any transformation is made in the variable $X$ it must also be
made in the differential $dX$. (See next paragraph.)



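The most widely used frequency curve is the normal curve,
whose equation may be written

$$Y\,dX = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(X - M)^2 / 2\sigma^2}\,dX \tag{3}$$

If we make the transformation

$$x = \frac{X - M}{\sigma}, \qquad dx = \frac{dX}{\sigma}$$

(3) goes into the simpler form

$$Y\,dx = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx \tag{4}$$

Another important curve is the Pearson type III curve

$$Y\,dX = \frac{1}{k!}\, X^{k} e^{-X}\,dX, \qquad 0 \le X < \infty \tag{5}$$

The symbol $k!$, called "k factorial," is defined by

$$k! = \int_{0}^{\infty} X^{k} e^{-X}\,dX = \Gamma(k + 1) \tag{6}$$

If $k$ is an integer, this reduces to $k(k - 1) \cdots 3 \cdot 2 \cdot 1$.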
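
The statements about areas under a frequency curve can be checked numerically. The following Python sketch assumes the usual form of the normal curve with mean M and standard deviation sigma; the function names, the midpoint rule of integration, and the particular values of M and sigma are choices made here for illustration only.

```python
import math

def normal_density(X, M, sigma):
    """Ordinate of the normal curve with mean M and standard deviation sigma."""
    return math.exp(-((X - M) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def proportion(f, a, b, steps=20000):
    """Midpoint-rule approximation to the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

M, sigma = 67.0, 2.0  # hypothetical mean and standard deviation

def f(X):
    return normal_density(X, M, sigma)

# Ten standard deviations either side stands in for the infinite limits.
total_area = proportion(f, M - 10 * sigma, M + 10 * sigma)
# Proportion of items within one standard deviation of the mean.
within_one_sigma = proportion(f, M - sigma, M + sigma)

# For integral k the gamma function reproduces the ordinary factorial.
k_factorial = math.gamma(4 + 1)
print(total_area, within_one_sigma, k_factorial)
```

The total area comes out as unity to within the error of the approximation, and roughly 68 per cent of the area lies within one standard deviation of the mean.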



Examples of ideal or theoretical discrete frequency distributions
are the binomial distribution

$$Y = \frac{N!}{X!\,(N - X)!}\, \theta^{X} (1 - \theta)^{N - X}, \qquad 0 < \theta < 1, \quad X = 0, 1, 2, \ldots, N \tag{7}$$

and the Poisson exponential distribution

$$Y = \frac{e^{-m} m^{X}}{X!}, \qquad X = 0, 1, 2, \ldots \tag{8}$$
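
A short Python sketch of these two discrete distributions follows. It assumes the usual forms of the binomial and Poisson distributions; the parameter names `theta` and `m`, and the numerical values used, are labels and figures chosen here for illustration.

```python
from math import comb, exp, factorial

def binomial(X, N, theta):
    """Binomial frequency of X successes in N trials, success chance theta."""
    return comb(N, X) * theta ** X * (1 - theta) ** (N - X)

def poisson(X, m):
    """Poisson frequency of the value X when the mean is m."""
    return exp(-m) * m ** X / factorial(X)

# The frequencies of a theoretical distribution total unity.
binomial_total = sum(binomial(X, 10, 0.3) for X in range(11))
# The Poisson series is infinite; sixty terms suffice for m = 2.
poisson_total = sum(poisson(X, 2.0) for X in range(60))
print(binomial_total, poisson_total)
```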


EXERCISES 

1. The frequency in a class of Table A is the number of students receiving
grades between the limits indicated. (a) Draw a histogram for the data of
this table. (b) Construct a cumulative frequency table from the data.
(c) Draw a cumulative frequency diagram.

2. Table B gives the number of men of a certain group whose weights fall
within specified limits. (a) Draw a histogram for this frequency table.
(b) Construct a cumulative frequency table, and (c) draw a cumulative
frequency diagram.

3. (a) Reduce Table C to a percentage frequency basis, and (b) draw the
corresponding histogram. (c) Construct a cumulative percentage frequency
table, and (d) draw the corresponding cumulative diagram.

4. Table D shows the number of lost articles turned in per day at the lost
and found bureau of a store. The frequency is the number of days on which
the specified number of articles were returned. (a) Draw a frequency
diagram. (b) Construct a cumulative frequency table. (c) Draw a cumulative
frequency diagram.

5. (a) Draw a frequency diagram for the data of Table E. (b) Construct
a cumulative table, and (c) draw a cumulative diagram.

6. Form a frequency table from the data of Table F. First construct a
tally sheet making a mark for each time that a temperature of a given number
of degrees occurs. From this tally sheet make a frequency table of class
interval 1°. Then choose what you consider an appropriate wider class
interval and group the frequencies accordingly. Next construct a rectangular
frequency diagram. Finally construct a cumulative frequency table and the
corresponding cumulative frequency diagram.



TABLE A

Grades Received by a Class of Students in an Examination

Grade          Frequency

10- 20
20- 30             1
30- 40             4
40- 50             6
50- 60             7
60- 70            12
70- 80            16
80- 90            10
90-100             4


TABLE B

Weights of a Group of Men

Weight in pounds      Frequency

100-110                   2
110-120                   3
120-130                  11
130-140                  34
140-150                  84
150-160                  66
160-170                  48
170-180                  33
180-190                  20
190-200                  11
200-210                   4
210-220                   3
220-230                   1
230-240                   1


TABLE C

Distribution of Employees in a Certain Industry According
to Annual Earnings

Annual earnings,      Number of
dollars               employees

   0- 200                 88
 200- 400                236
 400- 600                396
 600- 800                386
 800-1000                412
1000-1200                341
1200-1400                208
1400-1600            (illegible)
1600-1800            (illegible)
1800-2000                 68
2000-2200                 33
2200-2400                 18
2400-2600                 16


TABLE D

Lost Articles Returned

Number of
articles       Frequency

0                 84
1                 67
2                 37
3                 16
4                  6
5                  1







TABLE E

Number of Children Born per Family in 735 Families

Number of
children born      Number of families
per family

 0                     96
 1                    108
 2                    164
 3                    126
 4                     95
 5                     62
 6                     45
 7                     20
 8                     11
 9                      6
10                      5
11                      5
12                      1
13                      1




TABLE F

Hourly Temperatures in Degrees Fahrenheit, St. Louis, Mo., September, 1937
(Department of Agriculture, Weather Bureau)

[Table of hourly readings, A.M. and P.M., arranged by date; the entries are illegible in this copy.]

























CHAPTER II 


AVERAGES AND MOMENTS 

6. Averages. An average is a quantity which may be regarded 
as representative of a group of data. We may use an average to 
characterize a frequency distribution, or we may use the averages 
of two different distributions for purposes of comparison. For 
example, we may compare the average weights of two football 
teams, or the average wages in two different occupations. 

There are various types of average, such as the arithmetic 
mean, the geometric mean, the harmonic mean, the median, and 
the mode. 

7. Arithmetic mean. The arithmetic mean, often called merely
the mean, of a set of quantities is their total divided by their
number. Thus, if we have $N$ quantities $X_1, X_2, \ldots, X_N$, their
mean is

$$\bar{X} = \frac{1}{N}\sum X = \frac{1}{N}(X_1 + X_2 + \cdots + X_N) \tag{1}$$

If each $X_i$ occurs with the respective frequency $f_i$, the mean may
be written as

$$\bar{X} = \frac{1}{N}\sum Xf = \frac{1}{N}(X_1 f_1 + X_2 f_2 + \cdots + X_k f_k) \tag{2}$$

where $N = \sum f = f_1 + f_2 + \cdots + f_k$. However, equation (1)
will mean the same thing as equation (2) if we consider that some
of the $X$'s in (1) may be identical in value.

The arithmetic mean has the property that the algebraic sum of
deviations from it is zero. That is,

$$\sum (X - \bar{X}) = 0 \tag{3}$$

In computing the mean, it is sometimes convenient to choose
an arbitrary origin $X'$. Then the mean may be found by the
formula

$$\bar{X} = X' + \frac{1}{N}\sum (X - X') \tag{4}$$

For example, if we wish to find the mean of the numbers 205, 197,
200, 204, we might choose the origin 200, the deviations from
which are 5, -3, 0, 4, respectively. Then

$$\bar{X} = 200 + \tfrac{1}{4}(5 - 3 + 0 + 4) = 201.5$$

For application to a frequency table, formula (4) might be
written in the form

$$\bar{X} = X' + \frac{c}{N}\sum x' f \tag{5}$$

in which $x'$ is the deviation, in terms of the class interval $c$ as a unit.

Let us apply this formula to the computation of the arithmetic
mean of the heights in Table 1. The details of the computation
are shown in Table 3, in which the heights are regarded as
concentrated at the mid-values of the classes.

TABLE 3

Computation of the Mean Height of a Group of Men

Height         Deviation in class       Frequency
in inches      intervals from 67
X              x'                       f             x'f

59             -4                         1            -4
61             -3                         2            -6
63             -2                         9           -18
65             -1                        48           -48
67              0                       131             0
69              1                       102           102
71              2                        40            80
73              3                        13            39

Total...                                346           145













Here $X' = 67$ inches, $c = 2$ inches, and we find

$$\bar{X} = 67 + \frac{2}{346}(145) = 67.84 \text{ in.}$$
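
The short method with an arbitrary origin can be verified against the direct computation. The Python sketch below uses the mid-values and frequencies of Table 1; the variable names are chosen here for illustration.

```python
# Mid-values and frequencies of the classes of Table 1.
mid_values = [59, 61, 63, 65, 67, 69, 71, 73]
frequencies = [1, 2, 9, 48, 131, 102, 40, 13]

origin, c = 67, 2  # arbitrary origin and class interval
N = sum(frequencies)

# Deviations from the origin expressed in class intervals.
coded = [(X - origin) // c for X in mid_values]
sum_xf = sum(x * f for x, f in zip(coded, frequencies))

mean_coded = origin + c * sum_xf / N
mean_direct = sum(X * f for X, f in zip(mid_values, frequencies)) / N

# The algebraic sum of the deviations from the mean should vanish.
deviation_sum = sum((X - mean_direct) * f for X, f in zip(mid_values, frequencies))
print(N, sum_xf, round(mean_coded, 2))
```

Both routes give the same mean, the coded one merely keeping the arithmetic small.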

The arithmetic mean of a continuous distribution represented
by the frequency curve $Y = f(X)$ is given by the formula

$$\bar{X} = \frac{\int_{-\infty}^{\infty} X f(X)\,dX}{\int_{-\infty}^{\infty} f(X)\,dX} \tag{6}$$

or simply by the numerator if the area under the curve is taken
as unity.

8. Weighted mean. In the computation of a mean the various
quantities may be assigned various weights. If $W_i$ is the weight
associated with $X_i$, then

$$\text{Weighted mean} = \frac{\sum W_i X_i}{\sum W_i} \tag{7}$$

The weights may be somewhat arbitrary. For example, in computing
a student's final grade the instructor might assign various
weights to the laboratory grade, the final examination grade, the
test average, and the average of daily recitations, depending upon
the importance that he attaches to the various phases of the work.
On the other hand, they may be somewhat more definite; a student's
average in all his work would in general be found by weighting
the grade in each course by the number of hours per week that
the course meets. Comparison of (2) and (7) shows that the mean
of a frequency table may be regarded as a weighted mean in which
the weights are the frequencies.
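
A weighted mean of this kind might be computed as in the following Python sketch. The grades and weekly hours are hypothetical figures, and `weighted_mean` is a name chosen here.

```python
def weighted_mean(values, weights):
    """Sum of weight times value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical grades in three courses, weighted by hours per week.
grades = [85, 70, 90]
hours = [3, 5, 2]
average = weighted_mean(grades, hours)

# The mean of a frequency table is the same formula with the
# frequencies as weights (mid-values and frequencies of Table 1).
mid_values = [59, 61, 63, 65, 67, 69, 71, 73]
frequencies = [1, 2, 9, 48, 131, 102, 40, 13]
mean_height = weighted_mean(mid_values, frequencies)
print(average, round(mean_height, 2))
```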

9. Median. The median of a set of quantities is the middle
value when the quantities are arranged in order of magnitude.
Thus, the median of 1, 3, 8, 10, 20 is 8. Obviously when there is
an even number of quantities there is no middle quantity; in such
a case the median is usually defined as the number halfway
between the middle pair, that is, the arithmetic mean of the
middle pair. The median of 1, 3, 8, 10, 20, 25 would be
$\tfrac{1}{2}(8 + 10) = 9$.

The serial number of the median of $N$ quantities is $(N + 1)/2$.
For 99 quantities the serial number is $(99 + 1)/2 = 50$, and the
median is the fiftieth quantity when they are arranged in order.
There will be 49 quantities below the median and 49 above. For
100 quantities, $(N + 1)/2 = 101/2 = 50.5$, and the median is
halfway between the fiftieth and the fifty-first quantity.

To find the median in a frequency table, that is, in a grouped
frequency distribution, we must resort to interpolation. A
satisfactory interpolation formula, based upon the assumption that the
quantities are uniformly distributed in the class interval in which
the median lies, is

$$\text{Median} = l + \frac{\frac{1}{2}(N + 1) - N_b - \frac{1}{2}}{N_m}\, c \tag{8}$$

which reduces to the simpler form

$$\text{Median} = l + \frac{\frac{1}{2}N - N_b}{N_m}\, c \tag{9}$$

in which

    l   = lower limit of median class,
    N   = total frequency,
    N_m = frequency of median class,
    N_b = sum of frequencies below median class,
    c   = class interval.


Let us apply this formula and find the median height of the
group of men in Table 1. Here $N = 346$, and the serial number of
the median is $(346 + 1)/2 = 173.5$. Referring to the cumulative
frequency table, Table 1A, we see that there are 60 men below 66
inches in height, and 191 below 68 inches, so that the median is in
the class 66-68 inches, that is, the class whose lower limit is 66
inches. Employing formula (9), we have

$$\text{Median} = l + \frac{\frac{1}{2}N - N_b}{N_m}\, c = 66 + \frac{173 - 60}{131} \times 2 = 67.73 \text{ in.}$$
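
The interpolation for the median of a grouped distribution can be sketched in Python as follows. The function name is chosen here, and the median class is located by comparing cumulated frequencies with N/2, which is an assumption of this sketch; the data are those of Table 1.

```python
def grouped_median(lower_limits, frequencies, c):
    """Median by interpolation: l + (N/2 - N_b) / N_m * c."""
    N = sum(frequencies)
    half = N / 2
    below = 0  # N_b, the sum of frequencies below the current class
    for l, f in zip(lower_limits, frequencies):
        if below + f >= half:       # this is the median class
            return l + (half - below) / f * c
        below += f

lower_limits = [58, 60, 62, 64, 66, 68, 70, 72]
frequencies = [1, 2, 9, 48, 131, 102, 40, 13]
median = grouped_median(lower_limits, frequencies, c=2)
print(round(median, 2))
```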


For a continuous distribution the median is the value of that
abscissa corresponding to the ordinate which divides the area under
the frequency curve into two equal parts. Analytically, if $M$ is
the median, then

$$\int_{-\infty}^{M} f(X)\,dX = \int_{M}^{\infty} f(X)\,dX \tag{10}$$


10. Mode. The mode is the value which has the greatest
frequency. In a grouped frequency distribution we shall simply
refer to the modal class (the class 66-68 inches in Table 1, for
example), as a precise determination of the mode is difficult;
perhaps the only satisfactory way is to fit a theoretical frequency
curve to the data and then to find the abscissa corresponding to
its maximum point.

11. Geometric mean. The geometric mean of $N$ quantities is
the $N$th root of their product. It is best found by means of
logarithms. If $G$ is the geometric mean of $X_1, X_2, \ldots, X_N$, then

$$G = \left(\prod_{i=1}^{N} X_i\right)^{1/N} \quad \text{or} \quad (\Pi X)^{1/N} \tag{11}$$

$$\log G = \frac{1}{N}(\log X_1 + \cdots + \log X_N) = \frac{1}{N}\sum \log X \tag{12}$$

The geometric mean is useful in the construction of index
numbers.
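
The computation by logarithms might be sketched in Python as follows. Natural logarithms are used here as a matter of convenience; any base would serve, and the function name and sample figures are illustrative.

```python
import math

def geometric_mean(values):
    """Antilogarithm of the mean of the logarithms of the values."""
    mean_log = sum(math.log(x) for x in values) / len(values)
    return math.exp(mean_log)

G = geometric_mean([2, 4, 8])
print(G)
```

For the quantities 2, 4, 8 the product is 64, whose cube root is 4, and the logarithmic computation agrees to within rounding.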

12. Harmonic mean. Suppose that an automobile makes a
200-mile trip, covering the first 100 miles at the rate of 50 miles an
hour and the second 100 miles at the rate of 40 miles an hour. We
can not find its average speed by taking $\tfrac{1}{2}(50 + 40) = 45$ miles
per hour. The total time is 2 hours for the first 100 miles, plus
$2\tfrac{1}{2}$ hours for the second 100 miles, or a total of $4\tfrac{1}{2}$ hours. The
average speed is therefore $200 \div 4\tfrac{1}{2} = 44.\dot{4}$ miles per hour.

The same result can be obtained by employing the harmonic mean,
which is the reciprocal of the arithmetic mean of the reciprocals
of a set of quantities. If $H$ is the harmonic mean of $X_1, \ldots, X_N$,
then

$$\frac{1}{H} = \frac{1}{N}\sum \frac{1}{X} \tag{13}$$

Applying this formula to the above example, we get

$$\frac{1}{H} = \frac{1}{2}\left(\frac{1}{50} + \frac{1}{40}\right) = \frac{9}{400}, \qquad H = \frac{400}{9} = 44.\dot{4}$$

(A dot above a number following a decimal point means that the
number is to be repeated indefinitely. Thus $1.8\dot{3}$ means 1.8333....
Similarly $1.\dot{2}1\dot{6}$ means 1.216216216....)
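
The automobile example can be checked with exact fractions. In this Python sketch the function name is chosen here, and the figures are those of the example above.

```python
from fractions import Fraction

def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    return len(values) / sum(Fraction(1, x) for x in values)

H = harmonic_mean([50, 40])

# Direct computation: total distance divided by total time.
total_time = Fraction(100, 50) + Fraction(100, 40)
average_speed = Fraction(200) / total_time
print(H, average_speed, float(H))
```

Both computations give 400/9 miles per hour, the repeating decimal 44.444....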

13. Appropriateness of different averages. Different averages
may be used for different purposes. For example, in economic
statistics it is often desirable to disregard extreme items, which
may be due to unusual circumstances. In such cases the median
has been found to be serviceable, as it is not affected by extreme
items.

It was stated above that the geometric mean is useful in the 
construction of index numbers. 

In section 12 was given an example in which the harmonic 
mean was found to be the appropriate one to use. 

It may be worth mentioning at this point that the arithmetic,
geometric, and harmonic means of a set of quantities are always
in the same order of magnitude. If we designate them by
$A$, $G$, and $H$ respectively, then

$$A \ge G \ge H \tag{14}$$

the equality signs holding only if all the quantities have the same
value. For example, if we have the numbers 2, 4, 8, we find

$$A = \tfrac{1}{3}(2 + 4 + 8) = \tfrac{14}{3} = 4\tfrac{2}{3},$$

$$G = (2 \times 4 \times 8)^{1/3} = (64)^{1/3} = 4,$$

$$\frac{1}{H} = \frac{1}{3}\left(\frac{1}{2} + \frac{1}{4} + \frac{1}{8}\right) = \frac{1}{3} \cdot \frac{7}{8} = \frac{7}{24}, \qquad H = \frac{24}{7} = 3\tfrac{3}{7},$$

and we see that the values are arranged in the order of magnitude
specified by (14).
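
The ordering of the three means for the numbers 2, 4, 8 can be confirmed with a few lines of Python. Exact fractions are used for A and H; G is a cube root and is therefore computed in floating point. The variable names are chosen here for illustration.

```python
from fractions import Fraction
import math

values = [2, 4, 8]
n = len(values)

A = Fraction(sum(values), n)                 # arithmetic mean
G = math.prod(values) ** (1 / n)             # geometric mean (floating point)
H = n / sum(Fraction(1, x) for x in values)  # harmonic mean

in_order = A >= G >= H
print(A, G, H, in_order)
```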

If a man were purchasing a house and wanted to know the 
average time that it would take him to reach his place of business 
he would probably find the mode the most acceptable average to 
employ, since he would doubtless like to know how long it would 
usually take him to make the trip. 



Although the various averages other than the arithmetic mean
are thus seen to have their merits, in certain cases, nevertheless,
this familiar average has advantages which in most circumstances
outweigh those of the others. Chief among these is its reliability
in sampling. One of the important uses of an average calculated
from a sample is that of estimating the corresponding average in
the population, that is, the large group from which the sample is
drawn. Suppose that the arithmetic mean of a sample is 10 and
we estimate that the population mean is between 8 and 12;
suppose, on the other hand, that the median is also 10 and we estimate
that the population median is between 8 and 12. We are more
likely to be correct in our first estimate than in our second, unless
we are sampling from an unusual population.

14. Partition values. Quite analogous to the median are the
quartiles, the deciles, and the percentiles. These are not averages,
but might be termed partition values. The quartiles divide the
frequencies into four equal groups, the deciles into ten, and the
percentiles into one hundred. The serial number of the first or
lower quartile is $\tfrac{1}{4}(N + 1)$, that of the third or upper quartile is
$\tfrac{3}{4}(N + 1)$. The second quartile is the median. The serial
number of the $k$th decile is $(k/10)(N + 1)$, and that of the $k$th
percentile is $(k/100)(N + 1)$.

Formula (8) of section 9 may be generalized to give any partition
value. Thus, the partition which has a proportion $p$ of items
below it is given by the formula

$$l + \frac{p(N + 1) - N_b - \frac{1}{2}}{N_p}\, c$$

the meanings of the symbols of which will be obvious if section 9
is reread.

To illustrate the application of this formula, let us compute
the third quartile, $Q_3$, of the distribution of heights in
Table 1. We find the serial number of the third quartile to be
$\tfrac{3}{4}(346 + 1) = 260.25$. From Table 1A we see that $Q_3$ is in the
class whose lower limit is 68 inches. Thus,

$$Q_3 = 68 + \frac{260.25 - 191 - 0.5}{102} \times 2 = 69.348 \text{ in.}$$
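
The general partition formula can be sketched in Python as follows. The function name is hypothetical, and the rule used to locate the class containing the partition (comparing cumulated frequencies with the serial number less one-half) is an assumption of this sketch; the data are those of Table 1.

```python
def partition_value(p, lower_limits, frequencies, c):
    """Value with a proportion p of the items below it, by interpolation:
    l + (p(N + 1) - N_b - 1/2) / N_p * c."""
    N = sum(frequencies)
    serial = p * (N + 1)
    below = 0  # N_b, the sum of frequencies below the current class
    for l, f in zip(lower_limits, frequencies):
        if below + f >= serial - 0.5:   # class containing the partition
            return l + (serial - below - 0.5) / f * c
        below += f

lower_limits = [58, 60, 62, 64, 66, 68, 70, 72]
frequencies = [1, 2, 9, 48, 131, 102, 40, 13]

Q3 = partition_value(0.75, lower_limits, frequencies, c=2)
median = partition_value(0.5, lower_limits, frequencies, c=2)
print(round(Q3, 3), round(median, 2))
```

With p equal to one-half the same function returns the median found in section 9.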



18 


AVERAGES AND MOMENTS 


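15. Variance and standard deviation. The variance of a set
of quantities is defined as the sum of the squares of their deviations
from their mean, divided by their number, that is, their mean
square deviation from their mean. Analytically the variance is
defined by

$$ \sigma^2 = \frac{1}{N}\sum (X - \bar X)^2 \quad\text{or}\quad \frac{1}{N}\sum (X - \bar X)^2 f \tag{15} $$

for a discrete distribution, and by

$$ \sigma^2 = \int (X - \bar X)^2 f(X)\,dX \div \int f(X)\,dX \tag{16} $$

$$ \sigma^2 = \int (X - \bar X)^2 f(X)\,dX \quad\text{if}\quad \int f(X)\,dX = 1 \tag{17} $$

for a continuous distribution. If $x = X - \bar X$, (15) and (17)
assume respectively the simpler forms

$$ \sigma^2 = \frac{1}{N}\sum x^2 \quad\text{or}\quad \frac{1}{N}\sum x^2 f \tag{18} $$

$$ \sigma^2 = \int x^2 f(x)\,dx \tag{19} $$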

The standard deviation, $\sigma$, is the square root of the variance;
that is, it is the root-mean-square deviation about the mean.

It can be shown that the sum of the squares of the deviations of a
set of quantities from any fixed value is a minimum when that fixed
value is the mean. In other words, the variance is the minimum
mean square deviation and the standard deviation is the minimum
root-mean-square deviation.

For computational purposes the formula

$$ \sigma^2 = \frac{1}{N}\sum X^2 - \left(\frac{1}{N}\sum X\right)^2 \tag{20} $$

is better than (15), from which it can easily be derived. In (20),
X may be measured from any origin whatever. It is often convenient
and simpler to choose an origin near the mean.

The method of computing the variance from a grouped frequency
distribution will be illustrated in Table 4. Here the data
of Table 1 have been used.





TABLE 4

Computation of the Variance of the Heights of a Group of Men

 Height      Deviation   Frequency
in inches    from 67
    X           x'           f         x'f      x'²f    (x'+1)²f

   59           -4           1          -4       16         9
   61           -3           2          -6       18         8
   63           -2           9         -18       36         9
   65           -1          48         -48       48         0
   67            0         131           0        0       131
   69            1         102         102      102       408
   71            2          40          80      160       360
   73            3          13          39      117       208

 Total                     346         145      497      1133


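The appropriate formula is

$$ \sigma^2 = c^2\left[\frac{\sum x'^2 f}{N} - \left(\frac{\sum x' f}{N}\right)^2\right] \tag{21} $$

in which $x'$ is the deviation (expressed in terms of the class interval)
from any origin, preferably the center of a class in or near which
the mean lies, and $c$ is the class interval. From Table 4 we find

$$ \sigma^2 = \left[\frac{497}{346} - \left(\frac{145}{346}\right)^2\right] c^2 = 1.26079\,c^2 = 5.0432 \text{ in.}^2 $$

The final column of Table 4 affords a check, known as the
Charlier check, on the accuracy of the totals of that table, since
$\sum (x' + 1)^2 f = \sum x'^2 f + 2\sum x' f + \sum f$, or 1133 = 497 + 290 + 346.

The standard deviation is

$$ \sigma = \sqrt{1.26079}\;c = 1.123 \text{ class intervals} = 2.246 \text{ in.} $$

Incidentally we can find the mean (cf. section 7),

$$ \bar X = X' + \frac{\sum x' f}{N}\,c = 67 + \frac{145}{346} \times 2 = 67.84 \text{ in.} $$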

When a continuous variable is grouped into classes, an adjustment
called Sheppard's correction is sometimes applied to the
variance if the frequency distribution tapers off gradually at both
ends. This correction is $-c^2/12$. If $\sigma^{*2}$ represents the
variance after Sheppard's correction is applied, then

$$ \sigma^{*2} = \sigma^2 - \frac{c^2}{12} \tag{22} $$

From this value we may, of course, obtain the corresponding
value $\sigma^*$ of the standard deviation. In the example given,

$$ \sigma^{*2} = 5.0432 - \frac{4}{12} = 4.7099 \text{ in.}^2, \qquad \sigma^* = 2.17 \text{ in.} $$
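The arithmetic of Table 4 is easily mechanized. The following Python lines are a minimal sketch, assuming the class centers and frequencies of Table 4; the variable names are illustrative only. They form the sums, apply the Charlier check, and evaluate the variance, the mean, and the variance after Sheppard's correction.

```python
from math import sqrt

centers = [59, 61, 63, 65, 67, 69, 71, 73]   # class centers, inches
freqs = [1, 2, 9, 48, 131, 102, 40, 13]
origin, c = 67, 2                            # working origin and class interval

n = sum(freqs)
dev = [(x - origin) // c for x in centers]   # x' in class-interval units

s1 = sum(d * f for d, f in zip(dev, freqs))              # sum of x'f
s2 = sum(d * d * f for d, f in zip(dev, freqs))          # sum of x'^2 f
check = sum((d + 1) ** 2 * f for d, f in zip(dev, freqs))

# Charlier check on the totals of the table
assert check == s2 + 2 * s1 + n

variance = c ** 2 * (s2 / n - (s1 / n) ** 2)   # variance, in square inches
std_dev = sqrt(variance)
mean = origin + s1 / n * c
variance_adj = variance - c ** 2 / 12          # after Sheppard's correction

print(round(variance, 4), round(std_dev, 3), round(mean, 2))
print(round(variance_adj, 4), round(sqrt(variance_adj), 2))
```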


The variance and the standard deviation are measures of the
dispersion of a set of quantities or of a frequency distribution. A
distribution which is widely scattered will have a larger variance
and consequently a larger standard deviation than a more compact
group.

It is sometimes desirable to compare the dispersion of two
distributions. For example, although we should expect a larger
actual dispersion in the weights of rabbits than in the weights of
mice, the variability in relation to size might not be so great.
This suggests dividing each standard deviation by the respective
mean. The quotient, $\sigma/\bar X$ (usually expressed as a percentage), is
called the coefficient of variation. It is independent of the unit of
measurement and thus renders comparable not only the variability
of two distributions such as the weights of rabbits and the
weights of mice, but also the variability of two different
characteristics such as weight and stature.

16. Mean deviation. Another measure of dispersion is the
mean deviation, which is the mean of the absolute values of the
deviations from any value A. Analytically, it is defined by

$$ \frac{1}{N}\sum |X - A| \tag{23} $$

for a discrete variable, and by

$$ \int |X - A|\, f(X)\,dX \quad\text{if}\quad \int f(X)\,dX = 1 \tag{24} $$

for a continuous variable. As the mean deviation from the median
is the minimum mean deviation, A is usually set equal to the
median.

For computing the mean deviation of a grouped frequency
distribution the following formula* is recommended:

$$ \text{Mean deviation} = \frac{c}{N}\left[\sum |x'|\, f + (N_b - N_a)\,d + N_m\left(d^2 + \tfrac14\right)\right] \tag{25} $$

Here x' = deviation, in class-interval units, from center of class
          containing average from which deviations are measured.
     d  = distance, in class-interval units, from center of this
          class to average.
     N_b = total frequency below this class.
     N_a = total frequency above this class.
     N_m = frequency of this class.
     N  = total frequency.
     c  = class interval.
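Formula (25) may be tried on the heights of Table 4. The sketch below is illustrative only: the function names are hypothetical, the classes are assumed to be consecutive and of equal width, and d is taken as positive when the average lies above the class center. The second function computes the same quantity directly, on the assumption that the items of each class are spread uniformly over the class, and serves as a check.

```python
def grouped_mean_deviation(centers, freqs, c, average):
    """Mean deviation about `average` by the grouped-data formula (25)."""
    n = sum(freqs)
    # index of the class containing the average
    m = next(i for i, x in enumerate(centers) if c / 2 >= abs(average - x))
    d = (average - centers[m]) / c            # signed distance, class units
    n_b, n_a, n_m = sum(freqs[:m]), sum(freqs[m + 1:]), freqs[m]
    # |x'| is the number of classes separating class i from class m
    s_abs = sum(abs(i - m) * f for i, f in enumerate(freqs))
    return c / n * (s_abs + (n_b - n_a) * d + n_m * (d * d + 0.25))

def direct_mean_deviation(centers, freqs, c, average):
    """Same quantity, treating each class as uniformly filled."""
    total = 0.0
    for x, f in zip(centers, freqs):
        lo, hi = x - c / 2, x + c / 2
        if average >= hi:                     # class wholly below the average
            total += f * (average - x)
        elif lo >= average:                   # class wholly above the average
            total += f * (x - average)
        else:                                 # class containing the average
            total += f * ((average - lo) ** 2 + (hi - average) ** 2) / (2 * c)
    return total / sum(freqs)

centers = [59, 61, 63, 65, 67, 69, 71, 73]
freqs = [1, 2, 9, 48, 131, 102, 40, 13]
median = 66 + (173.5 - 60 - 0.5) / 131 * 2    # grouped median of the heights

md = grouped_mean_deviation(centers, freqs, 2, median)
print(round(md, 4), round(direct_mean_deviation(centers, freqs, 2, median), 4))
```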

17. Moments. The mean and the variance are special cases
of moments, the $k$th moment of $X_1, \ldots, X_N$ being defined as

$$ \mu_k' = \frac{1}{N}\sum X^k \tag{26} $$

The $k$th moment of a continuous distribution is

$$ \mu_k' = \int X^k f(X)\,dX \quad\text{if}\quad \int f(X)\,dX = 1 \tag{27} $$

The $k$th moment about the mean is

$$ \mu_k = \frac{1}{N}\sum x^k \quad\text{or}\quad \mu_k = \int x^k f(x)\,dx, \qquad x = X - \bar X \tag{28} $$

Obviously $\mu_1'$ is the mean and $\mu_2$ the variance, that is, $\mu_1' = \bar X$,
$\mu_2 = \sigma^2$. (The word "moment" will ordinarily be understood to
signify moment about the mean.)

* For its derivation and an illustration of its application the student is
referred to H. L. Rietz (Editor), "Handbook of Mathematical Statistics,"
Houghton Mifflin Co., Boston, 1924, pp. 20-31.

We shall illustrate the method of transferring from moments
about any origin to moments about the mean, by developing the
formula for the third moment.

$$
\begin{aligned}
\mu_3 &= \frac{1}{N}\sum (X - \bar X)^3
       = \frac{1}{N}\left(\sum X^3 - 3\bar X\sum X^2 + 3\bar X^2\sum X - N\bar X^3\right) \\
      &= \mu_3' - 3\bar X\mu_2' + 2\bar X^3 \\
      &= \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3
\end{aligned}
$$

The formulas for the first four moments are as follows:

$$
\begin{aligned}
\mu_1 &= 0 \\
\mu_2 &= \mu_2' - \mu_1'^2 \\
\mu_3 &= \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 \\
\mu_4 &= \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4
\end{aligned}
\tag{29}
$$

Others can be developed as shown above. All these formulas hold
for continuous as well as discrete variables. It will be noted that
the formula for $\mu_2$ is the same as (20).

The adjusted moments, after Sheppard's corrections for grouping
a continuous variable (see section 15) have been applied, are

$$ \mu_1^* = \mu_1 = 0, \qquad \mu_2^* = \mu_2 - \frac{c^2}{12}, \qquad \mu_3^* = \mu_3, \qquad \mu_4^* = \mu_4 - \frac{c^2}{2}\mu_2 + \frac{7}{240}c^4 \tag{30} $$

in which $c$ is the class interval employed in the grouping.
The following functions of moments are sometimes used:

$$ \alpha_3 = \frac{\mu_3}{\sigma^3} = \frac{\mu_3}{\mu_2^{3/2}} \quad \text{(often denoted by } \sqrt{\beta_1}\text{)} \tag{31} $$

$$ \alpha_4 = \frac{\mu_4}{\mu_2^2} = \frac{\mu_4}{\sigma^4} \quad \text{(often denoted by } \beta_2\text{)} \tag{32} $$

The quantity $\alpha_3$ is a measure of the skewness or lack of symmetry
of a frequency distribution. For a curve which has a longer
tail to the right, such as the curve in Fig. 4, the skewness is positive;
when the curve stretches out more to the left the skewness is
negative.

For the normal distribution $\alpha_4$ has the value 3, for curves which
come to a sharper peak than the normal curve it has a value
greater than 3, while for curves that are flatter than the normal
curve it has a value less than 3. The quantity $\alpha_4 - 3$, the excess
of the value of $\alpha_4$ over the value for the normal distribution, is a
measure of the kurtosis of the distribution.

The computation of the first four moments of the frequency
distribution of Table 1 is shown in Table 5.

TABLE 5

Computation of the First Four Moments of the Frequency
Distribution of Heights of a Group of Men

 Height
in inches
    X       x'      f      x'f     x'²f    x'³f    x'⁴f   (x'-1)⁴f

   59       -4      1      -4       16     -64      256      625
   61       -3      2      -6       18     -54      162      512
   63       -2      9     -18       36     -72      144      729
   65       -1     48     -48       48     -48       48      768
   67        0    131       0        0       0        0      131
   69        1    102     102      102     102      102        0
   71        2     40      80      160     320      640       40
   73        3     13      39      117     351     1053      208

 Total            346     145      497     535     2405     3013


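To verify the totals of Table 5 we may employ the Charlier
check

$$ \sum (x' \pm 1)^4 f = \sum x'^4 f \pm 4\sum x'^3 f + 6\sum x'^2 f \pm 4\sum x' f + \sum f \tag{33} $$

Here it seems more advisable to use the lower signs. We find

$$ 3013 = 2405 - 4 \times 535 + 6 \times 497 - 4 \times 145 + 346 $$

We now calculate

$$ \mu_1' = \tfrac{145}{346}\,c = 0.41908\,c, \qquad \mu_2' = \tfrac{497}{346}\,c^2 = 1.43642\,c^2 $$

$$ \mu_3' = \tfrac{535}{346}\,c^3 = 1.54624\,c^3, \qquad \mu_4' = \tfrac{2405}{346}\,c^4 = 6.95087\,c^4 $$

$$ \mu_2 = 1.43642\,c^2 - (0.41908\,c)^2 = 1.26079\,c^2 = 5.04316 \text{ in.}^2 $$

$$ \mu_3 = 1.54624\,c^3 - 3(1.43642\,c^2)(0.41908\,c) + 2(0.41908\,c)^3 = -0.11247\,c^3 = -0.89976 \text{ in.}^3 $$

$$ \mu_4 = 6.95087\,c^4 - 4(1.54624\,c^3)(0.41908\,c) + 6(1.43642\,c^2)(0.41908\,c)^2 - 3(0.41908\,c)^4 = 5.77997\,c^4 = 92.47952 \text{ in.}^4 $$

and, applying Sheppard's corrections,

$$ \mu_2^* = \left(1.26079 - \tfrac{1}{12}\right)c^2 = 1.17746\,c^2 = 4.70984 \text{ in.}^2 $$

$$ \mu_4^* = \left(5.77997 - \tfrac12 \times 1.26079 + \tfrac{7}{240}\right)c^4 = 5.17874\,c^4 = 82.85984 \text{ in.}^4 $$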
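The whole of this computation can be repeated by machine. The sketch below is a hedged illustration with names of our own choosing; it works throughout in class-interval units (that is, with c = 1), forms the moments about the working origin 67, applies the Charlier check with the lower signs, and passes to the moments about the mean, the Sheppard-adjusted moments, and the quantities of (31) and (32).

```python
centers = [59, 61, 63, 65, 67, 69, 71, 73]
freqs = [1, 2, 9, 48, 131, 102, 40, 13]
origin, c = 67, 2

n = sum(freqs)
dev = [(x - origin) // c for x in centers]          # x' in class units

def power_sum(k):
    """Sum of x'^k f over the table."""
    return sum(d ** k * f for d, f in zip(dev, freqs))

s1, s2, s3, s4 = (power_sum(k) for k in (1, 2, 3, 4))

# Charlier check, lower signs: sum of (x' - 1)^4 f
check = sum((d - 1) ** 4 * f for d, f in zip(dev, freqs))
assert check == s4 - 4 * s3 + 6 * s2 - 4 * s1 + n

# Moments about the working origin (class-interval units)
m1, m2, m3, m4 = s1 / n, s2 / n, s3 / n, s4 / n

# Moments about the mean
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4

# Sheppard's corrections with c = 1 in these units
mu2_adj = mu2 - 1 / 12
mu4_adj = mu4 - mu2 / 2 + 7 / 240

alpha3 = mu3 / mu2 ** 1.5      # skewness
alpha4 = mu4 / mu2 ** 2        # exceeds 3 for a peaked distribution

print(round(mu2, 5), round(mu3, 5), round(mu4, 5))
print(round(mu2_adj * c ** 2, 4), round(mu4_adj * c ** 4, 3))
```

Multiplying the kth moment by c^k converts it from class-interval units to inches, as in the last line.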


EXERCISES

1. The following observations were made on the vertical diameter of the
planet Venus (the unit is 1 second of angle): 42.70, 42.66, 43.01, 43.48, 42.76,
43.06, 43.63, 42.87, 41.60, 42.78, 42.95, 43.20, 43.18, 43.39, 43.10. Find
(a) the arithmetic mean, (b) the median, (c) the standard deviation, (d) the
mean deviation from the mean, (e) the mean deviation from the median.

In the next five exercises find, for the tables mentioned, the following
quantities: (a) arithmetic mean, (b) median, (c) upper and lower quartiles,
(d) mode, (e) standard deviation, (f) mean deviation from median, (g) third
moment, (h) fourth moment.

2. Table A, page 8.        3. Table B, page 8.

4. Table C, page 8.        5. Table D, page 8.

6. Table E, page 9.

7. From the results of exercise 6, page 7, calculate the values of the
following quantities from the raw data, and also from the grouped data, and
compare: (a) mean, (b) median, (c) standard deviation, (d) third moment,
(e) fourth moment.

8. The relative price of a commodity is the ratio, usually expressed as a
percentage, of its price at a given time to its price during a specified base
period. Find the geometric mean of the following set of relative prices:
142, 166, 94, 175, 150, 114, 160, 95, 72, 119.

9. Given N pairs of numbers $(X_1, X_1'), \ldots, (X_N, X_N')$. Let $G'$ be the
geometric mean of the quantities $X_1, \ldots, X_N$, $G''$ the geometric mean of
$X_1', \ldots, X_N'$, and $G$ the geometric mean of the ratios $X_1/X_1', \ldots, X_N/X_N'$.
Show that $G = G'/G''$.

10. A man motors from A to B. A large part of the distance is uphill,
and he gets a mileage of only 10 miles per gallon of gasoline. On the return
trip he makes 15 miles per gallon. Find the harmonic mean of his mileage.
Verify the fact that this is the proper average to use by assuming that the
distance from A to B is 60 miles.

11. In five different cities the numbers of streetcar tickets sold for a dollar
are respectively: 8, 12, 10, 8, 9. Find the harmonic mean of the number of
tickets sold for a dollar.

12. Table G gives index numbers for various items entering the cost of
living. Find an index of the cost of living by computing a weighted average
of these items. The weights to be used are also given in the table.





TABLE G

                    Index    Weight

Clothing             77.3      13
Food                 74.6      43
Fuel and light       85.8       6
Housing              64.6      18
Sundries             92.5      20


13. Table H is the grade sheet of a university graduate. (a) Find his
average grade by weighting each course taken by the number of units of credits
which the course carries. (b) Find his average grade in this manner for each
of the four years. (c) Verify the statement that his average grade for the
entire course is equal to the weighted average of his average grades for
freshman, sophomore, junior, and senior years, the weights being the total
numbers of credits for these respective years.











TABLE H

Grade Sheet of a University Student

                   1st Semester                 2nd Semester
             Course                       Course
Year         number  Credits  Grade       number  Credits  Grade

Freshman        1       3       72           6       3       70
                2               79           7       5       80
                3               85           8       4       82
                4               72           9       3       89
                5       3       64          10       3       75

Sophomore      11       2       71          17               76
               12       5       86          18               78
               13       5       77          19               88
               14       2       82          20               72
               15       2       77          21       1       74
               16       1       73          22       4       63

Junior         23               60          33               75
               24               70          34               76
               25       1       85          35               86
               26               96          36               97
               27               90          37       1       91
               28               72          38       3       60
               29               87          39       1       81
               30       3       85          40       3       78
               31       3       74          41       3       86
               32       3       85          42       3       78

Senior         43       3       78          51               74
               44       2       87          52               81
               45       2       70          53               72
               46       3       69          54               70
               47       1       82          55       1       84
               48       1       86          56       2       71
               49       2       83          57       1       85
               50       3       72          58       3       65

































CHAPTER III 


REGRESSION 

18. Regression or trend lines. If the pairs of numbers
(X, Y) = (0,1), (1,3), (3,2), (6,5), (8,4) are plotted, as in Fig. 5,*
we see that they tend to lie on a straight line. A line drawn so as
to pass near most of the points is called a regression line if X and Y
represent characteristics such as height and weight, for example,
or a trend line if X represents time and Y represents some such
quantity as population or the price of a commodity. The object
of such a line is usually to estimate one of the variables, say Y,
from the other, X.

A standard procedure in fitting a line to the points is the
method of least squares. By this method we write the equation
of a line

$$ Y' = a + bX \tag{1} $$

and then determine $a$ and $b$ so that the sum of the squares of the
vertical deviations of the points from this line will be the least
possible. That is, we minimize $\sum (Y - Y')^2$. We could just as
well minimize the sum of squares of the horizontal distances of the
points from the line (or of their perpendicular distances), but this
would in general give a different line. For a line of the form (1),
X is termed the independent and Y the dependent variable.

If we have N pairs of values $(X_1, Y_1), \ldots, (X_N, Y_N)$ and
substitute each X in (1) we obtain N values of $Y'$. We then wish
to minimize

$$ \sum (Y - Y')^2 = \sum (Y - a - bX)^2 $$

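* P. 30. (Such a figure is sometimes called a scatter diagram.)

Employing the usual methods of the calculus, we differentiate
partially with respect to $a$ and then $b$ and set the derivatives equal
to zero, obtaining the normal equations

$$
\begin{aligned}
aN + b\sum X &= \sum Y \\
a\sum X + b\sum X^2 &= \sum XY
\end{aligned}
\tag{2}
$$

If we write

$$ D = \begin{vmatrix} N & \sum X \\ \sum X & \sum X^2 \end{vmatrix} = N\sum X^2 - \left(\sum X\right)^2 \tag{3} $$

the normal equations have the solutions

$$ a = \frac{1}{D}\begin{vmatrix} \sum Y & \sum X \\ \sum XY & \sum X^2 \end{vmatrix} = \frac{\left(\sum X^2\right)\left(\sum Y\right) - \left(\sum XY\right)\left(\sum X\right)}{N\sum X^2 - \left(\sum X\right)^2} \tag{4} $$

$$ b = \frac{1}{D}\begin{vmatrix} N & \sum Y \\ \sum X & \sum XY \end{vmatrix} = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{N\sum X^2 - \left(\sum X\right)^2} \tag{5} $$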

The quantity $b$ is called the coefficient of regression of Y on X.
It measures the average increase of Y per unit increase of X. If
we determine a line of the form $X' = a + bY$, $b$ is called the
coefficient of regression of X on Y. The values of $a$ and $b$ in such an
equation can obviously be obtained from (4) and (5) respectively
by interchanging X and Y.

If the variables are measured from their respective means the
foregoing equations are materially simplified. Let $x$ and $y$ represent
deviations from the means of X and Y respectively. Then
$\sum x = \sum y = 0$, and the normal equations reduce to $aN = 0$ and
$b\sum x^2 = \sum xy$, with the solutions

$$ a = 0, \qquad b = \frac{\sum xy}{\sum x^2} \tag{6} $$

Even if only X is measured from its mean we have simpler
results, viz.,

$$ a = \frac{\sum Y}{N} = \bar Y, \qquad b = \frac{\sum xY}{\sum x^2} \tag{7} $$

In any case the equation of the line of regression can be written

$$ Y' - \bar Y = b(X - \bar X) \quad\text{or}\quad y' = bx \tag{8} $$

in which $b$ may be obtained from (5), (6), or (7).





The mean product of deviations of X and Y from their respective
means is called their covariance. It may be written in any
of the following forms:

$$ \frac{1}{N}\sum (X - \bar X)(Y - \bar Y) = \frac{1}{N}\sum XY - \bar X\bar Y = \frac{1}{N}\sum xy = \frac{1}{N}\sum Xy \tag{9} $$

If we divide numerator and denominator of $b$ in (6) by N, we see
that the coefficient of regression of Y on X is the covariance of
X and Y divided by the variance of X, and the coefficient of regression
of X on Y is the covariance of X and Y divided by the variance
of Y.
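The relation between covariance, variance, and the two coefficients of regression can be verified on the five points given at the beginning of this section. The lines below are a small sketch in exact fractions; the variable names are our own.

```python
from fractions import Fraction

X = [0, 1, 3, 6, 8]
Y = [1, 3, 2, 5, 4]
n = len(X)

mean_x = Fraction(sum(X), n)
mean_y = Fraction(sum(Y), n)

# Two of the equivalent forms of the covariance in (9)
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / n
cov_alt = Fraction(sum(x * y for x, y in zip(X, Y)), n) - mean_x * mean_y
assert cov == cov_alt

var_x = sum((x - mean_x) ** 2 for x in X) / n
var_y = sum((y - mean_y) ** 2 for y in Y) / n

b_y_on_x = cov / var_x     # coefficient of regression of Y on X
b_x_on_y = cov / var_y     # coefficient of regression of X on Y

print(cov, var_x, var_y, b_y_on_x, b_x_on_y)
```

The two coefficients are in general different, which is another way of seeing that the line of regression of Y on X and that of X on Y do not coincide.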

It is useful to know the sum of squares of deviations from the
line of least squares. This can be found directly by substituting
each $X_i$ in the equation of the regression line, computing the
corresponding $Y_i'$, subtracting it from $Y_i$, squaring the result, and
finally summing. A shorter method is to make use of the formula

$$ \sum (Y - Y')^2 = \sum Y^2 - a\sum Y - b\sum XY \tag{10} $$

which may be developed as follows:

$$
\begin{aligned}
\sum (Y - Y')^2 &= \sum (Y - a - bX)^2 \\
 &= \sum Y(Y - a - bX) - a\sum (Y - a - bX) - b\sum X(Y - a - bX)
\end{aligned}
$$

The last two terms drop out, since the normal equations (2) are
satisfied, and the remaining term reduces to the right side of (10).

This sum of squares may be placed in another convenient form.
If we replace $a$ and $b$ by their determinant values from (4) and (5),
we find that

$$ \sum (Y - Y')^2 = \begin{vmatrix} \sum Y^2 & \sum Y & \sum XY \\ \sum Y & N & \sum X \\ \sum XY & \sum X & \sum X^2 \end{vmatrix} \div \begin{vmatrix} N & \sum X \\ \sum X & \sum X^2 \end{vmatrix} \tag{11} $$

It is to be noted that the denominator is the determinant D and
that the numerator is this determinant bordered by $\sum Y^2$, $\sum Y$, $\sum XY$.

If we measure both variables from their means we can reduce
(11) to the simpler form

$$ \sum (y - y')^2 = \sum (Y - Y')^2 = \frac{1}{\sum x^2}\begin{vmatrix} \sum y^2 & \sum xy \\ \sum xy & \sum x^2 \end{vmatrix} \tag{12} $$




To illustrate the foregoing theory we shall fit a line of least 
squares to the set of points given at the beginning of this section. 
(We should, of course, usually have more pairs of values than five, 
but the regression line could be obtained by the process illustrated.) 
The summations necessary for forming the normal equations are 
to be found in Table 6. 

TABLE 6

  X    Y    X²    XY    Y²      Y'      Y - Y'    (Y - Y')²

  0    1     0     0     1     1.646    -0.646    0.417316
  1    3     1     3     9     2.022     0.978    0.956484
  3    2     9     6     4     2.774    -0.774    0.599076
  6    5    36    30    25     3.902     1.098    1.205604
  8    4    64    32    16     4.654    -0.654    0.427716

 18   15   110    71    55    14.998     0.002    3.606196


The normal equations are

$$
\begin{aligned}
5a + 18b &= 15 \\
18a + 110b &= 71
\end{aligned}
$$

and have the solutions

$$ a = \tfrac{186}{113} = 1.646, \qquad b = \tfrac{85}{226} = 0.376 $$

The line of least squares is therefore

$$ Y' = 1.646 + 0.376X $$

Note that it can also be written in the form (8)

$$ Y' - 3 = 0.376(X - 3.6) $$

[Fig. 5]

The line is shown in Fig. 5. Using (10), we find that

$$ \sum (Y - Y')^2 = 55 - \tfrac{186}{113} \times 15 - \tfrac{85}{226} \times 71 = \tfrac{815}{226} = 3.6061947 $$

This sum of squares has also been computed by the other method
in Table 6, but it is readily seen that the use of formula (10) is
much to be preferred.
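The two ways of obtaining the sum of squares can be compared by machine. The sketch below, with a hypothetical function name `fit_line`, applies formulas (4) and (5) in exact fractions and then evaluates the sum of squares both by formula (10) and by direct summation of the squared deviations.

```python
from fractions import Fraction

def fit_line(xs, ys):
    """Least-squares a and b of Y' = a + bX from formulas (4) and (5)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    d = n * sxx - sx * sx                      # the determinant D of (3)
    a = Fraction(sxx * sy - sxy * sx, d)
    b = Fraction(n * sxy - sx * sy, d)
    return a, b

X = [0, 1, 3, 6, 8]
Y = [1, 3, 2, 5, 4]
a, b = fit_line(X, Y)

# Sum of squared deviations by formula (10) ...
syy = sum(y * y for y in Y)
sxy = sum(x * y for x, y in zip(X, Y))
ss_formula = syy - a * sum(Y) - b * sxy

# ... and by direct summation, as in the last column of Table 6
ss_direct = sum((y - a - b * x) ** 2 for x, y in zip(X, Y))

print(a, b, ss_formula, float(ss_formula))
```

Because exact fractions are used, the two sums agree exactly; the small difference between 3.606196 in Table 6 and 3.6061947 above arises only from rounding Y' to three decimals.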



















19. Transformations. Certain types of equation can be
reduced to linear form by a proper transformation of the variables.
For example, some data, such as the numbers of bacteria in a colony
at different times, conform to the exponential equation

$$ Y' = AB^X \tag{13} $$

which may be written in the alternative form 

r = (14) 

Taking logarithms of both sides of (13), we get

$$ \log Y' = \log A + X\log B \tag{15} $$

or, if we set $\log Y' = y'$, $\log A = a$, $\log B = b$,

$$ y' = a + bX \tag{16} $$

We can now fit a line of least squares to the pairs of values
$(X, y = \log Y)$ as in section 18. It should be noted that it is
$\sum (y - y')^2$, not $\sum (Y - Y')^2$, which will be minimized. This sum of
squares is given by

$$ \sum (y - y')^2 = \sum y^2 - a\sum y - b\sum Xy \tag{17} $$

To discover whether a set of data conforms to the law (13)
we may plot X and log Y on ordinary graph paper, or X and Y on
semi-logarithmic paper, that is, paper on which one set of rulings
is uniformly spaced and the other set logarithmically spaced. If
in either case the points seem to lie along a straight line, we may
fit a line of type (16).

In Table 7, column Y gives the number of bacteria per unit of
volume existing in a culture at the end of X hours. Plotting
X and Y on semi-logarithmic paper, as in Fig. 6, using the arithmetic
or uniform scale for X and the logarithmic scale for Y, we
see that the points lie approximately on a straight line. Figure 7
shows X and log Y plotted on ordinary graph paper.

[Fig. 6. Plot of X and Y of Table 7 on Semi-logarithmic Paper.
Horizontal axis: number of hours.]

[Fig. 7. Plot of X and y of Table 7.]

The numerical work of fitting an exponential curve is shown in
Table 7. In order to make the logarithms somewhat smaller we
use log 0.1Y rather than log Y. This merely moves the decimal
point in Y one place and reduces the characteristic of log Y by 1.

TABLE 7

   X      Y     x = X - 2    y = log 0.1Y      xy

   0     73        -2           0.8633       -1.7266
   1     91        -1           0.9590       -0.9590
   2    112         0           1.0492        0
   3    131         1           1.1173        1.1173
   4    162         2           1.2095        2.4190

 Total                          5.1983        0.8507






X = number of hours

Y = number of bacteria per unit volume after X hours

$$ y' = \bar y + bx = \bar y + b(X - 2) $$

$$ b = \frac{\sum x(y - \bar y)}{\sum x^2} = \frac{\sum xy}{\sum x^2} = \frac{0.8507}{10} = 0.08507 $$

$$ y' = 1.03966 + 0.08507(X - 2) = 0.86952 + 0.08507X $$

$$ \log 0.1Y' = \log 7.405 + X\log 1.216 $$

$$ 0.1Y' = 7.405(1.216)^X $$

$$ Y' = 74.0\bar{5}\,(1.216)^X $$

Note. The minus sign over the 5 in the number $74.0\bar{5}$ indicates that this
number is less than 74.05. This knowledge is useful if we wish to cut down
the number of digits in the number. Here the number to the nearest tenth
is 74.0, not 74.1.

The sum of squares of deviations is

$$
\begin{aligned}
\sum (y - y')^2 &= \sum y^2 - 1.03966\sum y - 0.08507\sum xy \\
 &= 5.47704^* - 1.03966 \times 5.1983 - 0.08507 \times 0.8507 \\
 &= 5.47704 - 5.40446 - 0.07237 \\
 &= 0.0002
\end{aligned}
$$
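The fitting of the exponential curve may be sketched as follows. It should be regarded as an illustration only: the counts 73, 91, 112, 131, 162 are assumed values of Y recovered as the antilogarithms of the tabulated values of log 0.1Y, and the variable names are our own. Full-precision logarithms are used, so the results differ slightly in the last place from the four-figure hand computation.

```python
from math import log10

X = [0, 1, 2, 3, 4]
Y = [73, 91, 112, 131, 162]      # assumed counts (antilogs of the y column)

y = [log10(0.1 * v) for v in Y]  # y = log 0.1Y
n = len(X)
x_bar = sum(X) / n
y_bar = sum(y) / n
x = [v - x_bar for v in X]       # X measured from its mean

b = sum(u * v for u, v in zip(x, y)) / sum(u * u for u in x)
a = y_bar - b * x_bar            # intercept of y' = a + bX

A = 10 * 10 ** a                 # undo the factor 0.1, so that Y' = A * B**X
B = 10 ** b

# Sum of squares of deviations of y from the fitted line
ss = sum((v - a - b * u) ** 2 for u, v in zip(X, y))

print(round(b, 5), round(A, 2), round(B, 3), round(ss, 4))
```

It is the deviations of the logarithms, not of Y itself, whose sum of squares is minimized here.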

Another type of equation which can be reduced to linear form is

$$ Y' = AX^B \tag{18} $$

By taking logarithms of both sides we reduce this to the form

$$ \log Y' = \log A + B\log X \tag{19} $$

or, if we set $\log Y' = y'$, $\log A = a$, $\log X = x$,

$$ y' = a + Bx \tag{20} $$

Again, the constants a and B can be determined by the method of 
least squares. 

* Machine-calculated without recording the individual values of $y^2$.





To determine whether a set of data conforms to the law (18) 
we may plot log X and log Y on ordinary graph paper, or X and Y 
on logarithmic paper (both scales logarithmic), and note whether 
the points tend to lie on a straight line. 

The sum of squares of deviations is

$$ \sum (y - y')^2 = \sum y^2 - a\sum y - B\sum xy \tag{21} $$

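20. Multiple regression. If we are concerned with more than
two variables we may wish to estimate one of them from all the
others. For example, we may wish to estimate Y from $X_1$ and $X_2$
by means of a multiple regression equation

$$ Y' = b_0 + b_1X_1 + b_2X_2 \tag{22} $$

The coefficients $b_1$ and $b_2$ are the partial regression coefficients of
Y on $X_1$ and $X_2$ respectively; $b_1$, for example, measures the
average increase of Y per unit increase of $X_1$, when $X_2$ is held
constant. Geometrically this equation represents a plane, and
the ordinary procedure is to determine the $b$'s so as to minimize
the sum of squares of vertical deviations (assuming the Y-axis to
be vertical) from the plane. We are led to the normal equations

$$
\begin{aligned}
b_0N + b_1\sum X_1 + b_2\sum X_2 &= \sum Y \\
b_0\sum X_1 + b_1\sum X_1^2 + b_2\sum X_1X_2 &= \sum X_1Y \\
b_0\sum X_2 + b_1\sum X_1X_2 + b_2\sum X_2^2 &= \sum X_2Y
\end{aligned}
\tag{23}
$$

which have the solutions

$$
b_0 = \frac{1}{D}\begin{vmatrix} \sum Y & \sum X_1 & \sum X_2 \\ \sum X_1Y & \sum X_1^2 & \sum X_1X_2 \\ \sum X_2Y & \sum X_1X_2 & \sum X_2^2 \end{vmatrix},
\qquad
b_1 = \frac{1}{D}\begin{vmatrix} N & \sum Y & \sum X_2 \\ \sum X_1 & \sum X_1Y & \sum X_1X_2 \\ \sum X_2 & \sum X_2Y & \sum X_2^2 \end{vmatrix},
$$

$$
b_2 = \frac{1}{D}\begin{vmatrix} N & \sum X_1 & \sum Y \\ \sum X_1 & \sum X_1^2 & \sum X_1Y \\ \sum X_2 & \sum X_1X_2 & \sum X_2Y \end{vmatrix},
\qquad
D = \begin{vmatrix} N & \sum X_1 & \sum X_2 \\ \sum X_1 & \sum X_1^2 & \sum X_1X_2 \\ \sum X_2 & \sum X_1X_2 & \sum X_2^2 \end{vmatrix}
\tag{24}
$$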


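If we measure $X_1$, $X_2$, Y from their respective means, setting

$$ x_1 = X_1 - \bar X_1, \qquad x_2 = X_2 - \bar X_2, \qquad y = Y - \bar Y $$

the first normal equation yields $b_0 = 0$ at once and the others
reduce to

$$
\begin{aligned}
b_1\sum x_1^2 + b_2\sum x_1x_2 &= \sum x_1y \\
b_1\sum x_1x_2 + b_2\sum x_2^2 &= \sum x_2y
\end{aligned}
\tag{25}
$$

From these, or from (24), we find

$$
b_1 = \begin{vmatrix} \sum x_1y & \sum x_1x_2 \\ \sum x_2y & \sum x_2^2 \end{vmatrix} \div \begin{vmatrix} \sum x_1^2 & \sum x_1x_2 \\ \sum x_1x_2 & \sum x_2^2 \end{vmatrix},
\qquad
b_2 = \begin{vmatrix} \sum x_1^2 & \sum x_1y \\ \sum x_1x_2 & \sum x_2y \end{vmatrix} \div \begin{vmatrix} \sum x_1^2 & \sum x_1x_2 \\ \sum x_1x_2 & \sum x_2^2 \end{vmatrix}
\tag{26}
$$

The terms such as $\sum x_1^2$, $\sum x_1x_2$, and $\sum x_1y$ can be found from
the relations

$$ \sum x^2 = \sum X^2 - \frac{\left(\sum X\right)^2}{N}, \qquad \sum xy = \sum XY - \frac{\left(\sum X\right)\left(\sum Y\right)}{N} $$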

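For the sum of squares of deviations from the regression plane
we have

$$
\begin{aligned}
\sum (Y - Y')^2 &= \sum Y^2 - b_0\sum Y - b_1\sum X_1Y - b_2\sum X_2Y \\
 &= \sum y^2 - b_1\sum x_1y - b_2\sum x_2y \\
 &= \begin{vmatrix} \sum Y^2 & \sum Y & \sum X_1Y & \sum X_2Y \\ \sum Y & N & \sum X_1 & \sum X_2 \\ \sum X_1Y & \sum X_1 & \sum X_1^2 & \sum X_1X_2 \\ \sum X_2Y & \sum X_2 & \sum X_1X_2 & \sum X_2^2 \end{vmatrix} \div \begin{vmatrix} N & \sum X_1 & \sum X_2 \\ \sum X_1 & \sum X_1^2 & \sum X_1X_2 \\ \sum X_2 & \sum X_1X_2 & \sum X_2^2 \end{vmatrix} \\
 &= \begin{vmatrix} \sum y^2 & \sum x_1y & \sum x_2y \\ \sum x_1y & \sum x_1^2 & \sum x_1x_2 \\ \sum x_2y & \sum x_1x_2 & \sum x_2^2 \end{vmatrix} \div \begin{vmatrix} \sum x_1^2 & \sum x_1x_2 \\ \sum x_1x_2 & \sum x_2^2 \end{vmatrix}
\end{aligned}
\tag{27}
$$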


For the actual numerical solution of a set of normal equations
it is better to use some such systematic method as the Doolittle
method * or the method which we shall now illustrate by fitting a
linear regression of the form (22) to the data in the first three
columns of Table 8.

* See Frederick C. Mills, "Statistical Methods Applied to Economics and
Business," Henry Holt & Co., New York, 1938, pp. 655-659.


TABLE 8

  X₁    X₂     Y     X₁²    X₁X₂    X₂²    X₁Y    X₂Y

   0     4     1      0       0     16      0      4
   1     4     3      1       4     16      3     12
   3     3     2      9       9      9      6      6
   6     2     5     36      12      4     30     10
   8     0     4     64       0      0     32      0

  18    13    15    110      25     45     71     32


The normal equations and their solutions are shown below:

                                                            Sum of
                                                         coefficients

(A)                      5b₀ +  18b₁ +  13b₂ -   15 = 0        21
(B)                     18b₀ + 110b₁ +  25b₂ -   71 = 0        82
(C)                     13b₀ +  25b₁ +  45b₂ -   32 = 0        51
(D) = 5(B) - 18(A)             226b₁ - 109b₂ -   85 = 0        32
(E) = 5(C) - 13(A)            -109b₁ +  56b₂ +   35 = 0       -18
(F) = 109(D) + 226(E)                  775b₂ - 1355 = 0      -580
(G) = (F) ÷ 775                           b₂ - 1.7484 = 0    -0.7484

(D)   226b₁ = 109 × 1.7484 + 85 = 275.5756,     b₁ = 1.2194
(E)  -109b₁ = -56 × 1.7484 - 35 = -132.9104,    b₁ = 1.2194

(A)    5b₀ = -18 × 1.2194 - 13 × 1.7484 + 15 = -29.6784,     b₀ = -5.9357
(B)   18b₀ = -110 × 1.2194 - 25 × 1.7484 + 71 = -106.8440,   b₀ = -5.9358
(C)   13b₀ = -25 × 1.2194 - 45 × 1.7484 + 32 = -77.1630,     b₀ = -5.9356


The method of obtaining equations (D) and (E) is indicated
in the solution. For example, (D) is obtained by multiplying (B)
by 5, the coefficient of $b_0$ in (A), and (A) by -18, the negative of
the coefficient of $b_0$ in (B), and adding the results, thus eliminating
$b_0$. If it seems desirable, these extra equations, which may
be symbolically written

(A') = -18(A) and (B') = 5(B)

may be inserted between (C) and (D). Similar equations may be
inserted at the proper places in the solution. However, an experienced
computer would probably prefer to obtain (D) and (E)
directly, and this saves the labor of writing down the equations
such as (A') and (B').

The column headed "Sum of coefficients" serves as an invaluable
check on the correctness of the work at each stage of the
solution. The sum of the coefficients in equation (A) is 21, that
of the coefficients in (B) is 82. Since (D) is obtained by multiplying
(B) by 5 and (A) by -18 and adding, the sum of the coefficients
of (D) must be equal to 5 × 82 - 18 × 21 = 32. Similarly,
the other steps may be checked by performing on the
respective sums of coefficients the same operations that have been
performed upon the equations.

It will be observed that in all the above equations the coefficients
of the $b$'s are symmetric about their principal diagonal.
Because of this it is not necessary to rewrite those coefficients
below this diagonal. A compact form of arranging the foregoing
solution (as far as $b_2 = 1.7484$) is shown below.

          b₀      b₁       b₂      const.       s

(A)        5      18       13       -15         21
(B)              110       25       -71         82
(C)                        45       -32         51
(D)              226     -109       -85         32
(E)                        56        35        -18
(F)                       775     -1355       -580
(G)                         1    -1.7484    -0.7484


In calculating the "sum of coefficients" column, labeled s, one
must realize that an L-shaped path must often be taken, since
those coefficients below the diagonal have not been written down.
This path has its corner, or turning point, at the number in the
diagonal. Thus we have by way of illustration in the present
example

for equation (B): down the b₁ column to the diagonal, then along the row,
                  18 + 110 + 25 - 71 = 82
for equation (C): down the b₂ column to the diagonal, then along the row,
                  13 + 25 + 45 - 32 = 51
The regression equation is

$$ Y' = -5.936 + 1.219X_1 + 1.748X_2 $$

or

$$ Y' - 3 = 1.219(X_1 - 3.6) + 1.748(X_2 - 2.6) $$
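As a check on the hand solution, the normal equations may be formed from the first three columns of Table 8 and solved mechanically. The sketch below does not follow the tabular scheme used above; it substitutes ordinary Gauss-Jordan elimination in exact fractions, and the function names `normal_equations` and `solve` are hypothetical.

```python
from fractions import Fraction

def normal_equations(columns, y):
    """Augmented matrix of the normal equations for Y' = b0 + b1*X1 + ...

    A column of ones is prefixed so that the first equation is that for b0.
    """
    cols = [[1] * len(y)] + columns
    return [
        [Fraction(sum(u * v for u, v in zip(ci, cj))) for cj in cols]
        + [Fraction(sum(u * v for u, v in zip(ci, y)))]
        for ci in cols
    ]

def solve(aug):
    """Gauss-Jordan elimination on an augmented matrix of Fractions."""
    m = [row[:] for row in aug]
    k = len(m)
    for i in range(k):
        p = next(r for r in range(i, k) if m[r][i] != 0)   # pivot row
        m[i], m[p] = m[p], m[i]
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]
        for r in range(k):
            if r != i:
                factor = m[r][i]
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    return [row[-1] for row in m]

X1 = [0, 1, 3, 6, 8]
X2 = [4, 4, 3, 2, 0]
Y = [1, 3, 2, 5, 4]

aug = normal_equations([X1, X2], Y)
b = solve(aug)
print([float(v) for v in b])
```

The exact values so obtained agree, to the number of places carried, with the b's found above from equations (A) to (G).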

R. A. Fisher,* following Gauss, has given the following method
of handling the normal equations: Let us replace the normal
equations (23) by the three sets of equations

For j = 0, 1, 2

$$
\begin{aligned}
c_{0j}N + c_{1j}\sum X_1 + c_{2j}\sum X_2 &= 1,\ 0,\ 0 \\
c_{0j}\sum X_1 + c_{1j}\sum X_1^2 + c_{2j}\sum X_1X_2 &= 0,\ 1,\ 0 \\
c_{0j}\sum X_2 + c_{1j}\sum X_1X_2 + c_{2j}\sum X_2^2 &= 0,\ 0,\ 1
\end{aligned}
\tag{28}
$$

with $c_{ij} = c_{ji}$. (It is equivalent to replace equations (25) by

$$
\begin{aligned}
c_{11}\sum x_1^2 + c_{12}\sum x_1x_2 &= 1 & \qquad c_{12}\sum x_1^2 + c_{22}\sum x_1x_2 &= 0 \\
c_{11}\sum x_1x_2 + c_{12}\sum x_2^2 &= 0 & \qquad c_{12}\sum x_1x_2 + c_{22}\sum x_2^2 &= 1
\end{aligned}
\tag{29}
$$

where the $x$'s are, as usual, deviations from means.) If D is the
determinant of the coefficients of (28), then $c_{ij} = D_{ij}/D$, where
$D_{ij}$ is the cofactor of the element of D which is the coefficient of
$c_{ij}$, that is, $(-1)^{i+j}$ times the minor obtained from D by striking
out the row and the column occupied by the coefficient of $c_{ij}$.†

* "Statistical Methods for Research Workers," Oliver & Boyd, Edinburgh
and London, section 29.

† See Maxime Bôcher, "Introduction to Higher Algebra," The Macmillan
Co., New York, 1907.




In the present example we have the set-up and solution shown
below.

c₀₂ = -6.3226,  c₁₂ = 0.7032,  c₂₂ = 1.4581

(D)   226c₀₁ = 109(-6.3226) - 18 = -707.1634,    c₀₁ = -3.1290
(D)   226c₁₁ = 109 × 0.7032 + 5 = 81.6488,       c₁₁ = 0.3613
(E)  -109c₀₁ = -56(-6.3226) - 13 = 341.0656,     c₀₁ = -3.1290
(E)  -109c₁₁ = -56 × 0.7032 + 0 = -39.3792,      c₁₁ = 0.3613

(A)    5c₀₀ = -18(-3.1290) - 13(-6.3226) + 1 = 139.5158,    c₀₀ = 27.9032
(B)   18c₀₀ = -110(-3.1290) - 25(-6.3226) + 0 = 502.2550,   c₀₀ = 27.9031

Here we have solved all three sets, of three equations each, at the
same time. The solution follows very closely that previously
given. The constant term in each equation, however, is different,
and naturally the s or "sum of coefficients" columns will be different.
For example, (A) really consists of three equations, each
corresponding to a fixed value of j. When j = 0, we have

$$ 5c_{00} + 18c_{01} + 13c_{02} - 1 = 0, \qquad s = 35 $$

When j = 1, we have

$$ 5c_{01} + 18c_{11} + 13c_{12} = 0, \qquad s = 36 $$

When j = 2, we have

$$ 5c_{02} + 18c_{12} + 13c_{22} = 0, \qquad s = 36 $$

The $b$'s can be found from the $c$'s by means of the relations

$$
\begin{aligned}
b_0 &= c_{00}\sum Y + c_{01}\sum X_1Y + c_{02}\sum X_2Y \\
b_1 &= c_{01}\sum Y + c_{11}\sum X_1Y + c_{12}\sum X_2Y \\
b_2 &= c_{02}\sum Y + c_{12}\sum X_1Y + c_{22}\sum X_2Y
\end{aligned}
\tag{30}
$$
























In the present problem we have

$$
\begin{aligned}
b_0 &= 27.9032 \times 15 - 3.1290 \times 71 - 6.3226 \times 32 = -5.9342 \\
b_1 &= -3.1290 \times 15 + 0.3613 \times 71 + 0.7032 \times 32 = 1.2197 \\
b_2 &= -6.3226 \times 15 + 0.7032 \times 71 + 1.4581 \times 32 = 1.7474
\end{aligned}
$$

It will be noted that there are slight discrepancies between these
values and those previously obtained. These are due to the fact
that the $c$'s have not been carried to a sufficient number of decimal
places to make the values of the $b$'s that we have just obtained
quite as accurate as the previous values.
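In modern terms the c's are the elements of the inverse of the matrix of coefficients of the normal equations. The following sketch, with hypothetical function names, inverts that matrix in exact fractions by Gauss-Jordan elimination (a substitute for the tabular scheme of the text) and then applies the relations (30). Since no rounding occurs, the discrepancies noted above do not arise.

```python
from fractions import Fraction

def invert(matrix):
    """Inverse of a square matrix by Gauss-Jordan elimination on [S | I]."""
    k = len(matrix)
    m = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for i in range(k):
        p = next(r for r in range(i, k) if m[r][i] != 0)   # pivot row
        m[i], m[p] = m[p], m[i]
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]
        for r in range(k):
            if r != i:
                factor = m[r][i]
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    return [row[k:] for row in m]

def coefficients(c, sums):
    """Relations (30): each b is a row of c's times the sums involving Y."""
    return [sum(cij * s for cij, s in zip(row, sums)) for row in c]

# Coefficients of the normal equations (A), (B), (C) of the example
S = [[5, 18, 13],
     [18, 110, 25],
     [13, 25, 45]]
C = invert(S)

# Sums of Y, X1*Y, X2*Y from Table 8
b = coefficients(C, [15, 71, 32])
print([round(float(v), 4) for v in C[0]], [float(v) for v in b])
```

For a new set of Y's with the same X's only the three sums passed to `coefficients` need be recomputed; the matrix `C` is reused unchanged.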

One advantage of the method just described is that, if we wish
to find the regression equation for a new set of Y's with the same
set of $X_1$'s, $X_2$'s, etc., we can make use of the $c$'s already found
and determine the $b$'s from (30) with little extra labor. Suppose,
for example, that we have 26 stations at which we have observed
the first flowering dates of 10 species of plants over a period of 12
years. Let variables $X_1$ and $X_2$ be the altitude and latitude
respectively of the stations, $X_3$ the year. If $Y_1$ is the first flowering
date of the first species, $Y_2$ of the second species, and so on,
we might want to determine ten regression equations (one for
each species) of the type

$$ Y' = b_0 + b_1X_1 + b_2X_2 + b_3X_3 $$

Instead of deriving each of these separately we could derive the
$c_{ij}$ for the altitude and the latitude of the 26 stations and for the
12 years. These would be

$$
\begin{matrix}
c_{00}, & c_{01}, & c_{02}, & c_{03} \\
        & c_{11}, & c_{12}, & c_{13} \\
        &         & c_{22}, & c_{23} \\
        &         &         & c_{33}
\end{matrix}
$$

since $c_{ij} = c_{ji}$, and from them we could find the $b$'s of all ten
equations by using formulas similar to (30).

Another advantage of this method will be evident when we
study the significance of regression coefficients.





It is not difficult to extend the foregoing discussion of multiple
regression to the case of $k$ independent variables $X_1, \ldots, X_k$.
The equation of the regression may be written

$$ Y' = b_0 + b_1X_1 + b_2X_2 + \cdots + b_kX_k \tag{31} $$

and the normal equations are

$$
\begin{aligned}
b_0N + b_1\sum X_1 + b_2\sum X_2 + \cdots + b_k\sum X_k &= \sum Y \\
b_0\sum X_1 + b_1\sum X_1^2 + b_2\sum X_1X_2 + \cdots + b_k\sum X_1X_k &= \sum X_1Y \\
b_0\sum X_2 + b_1\sum X_1X_2 + b_2\sum X_2^2 + \cdots + b_k\sum X_2X_k &= \sum X_2Y \\
\cdots \\
b_0\sum X_k + b_1\sum X_1X_k + b_2\sum X_2X_k + \cdots + b_k\sum X_k^2 &= \sum X_kY
\end{aligned}
\tag{32}
$$

The sum of squares of deviations is

$$ \sum (Y - Y')^2 = \sum Y^2 - b_0\sum Y - b_1\sum X_1Y - b_2\sum X_2Y - \cdots - b_k\sum X_kY \tag{33} $$

21. Curvilinear regression. The $X_i$ do not actually have to
be independent. For example, if we wish to fit a regression equation
of the type

$$ Y' = b_0 + b_1X + b_2Z + b_3X^2 + b_4XZ + b_5Z^2 $$

we can set $X = X_1$, $Z = X_2$, $X^2 = X_3$, $XZ = X_4$, $Z^2 = X_5$, and
proceed as above.

One important case is that in which the regression function is
a polynomial

$$ Y' = b_0 + b_1X + b_2X^2 + \cdots + b_kX^k \tag{34} $$

For purposes of illustration we shall fit a second-degree equation
(geometrically a parabola) to the points of section 18 (see Table 6).
The normal equations are

$$
\begin{aligned}
b_0N + b_1\sum X + b_2\sum X^2 &= \sum Y \\
b_0\sum X + b_1\sum X^2 + b_2\sum X^3 &= \sum XY \\
b_0\sum X^2 + b_1\sum X^3 + b_2\sum X^4 &= \sum X^2Y
\end{aligned}
\tag{35}
$$






The various summations needed will be found in Table 9.

TABLE 9

          X    X^2    X^3     X^4     Y    XY   X^2 Y    Y^2
          0      0      0       0     1     0       0      1
          1      1      1       1     3     3       3      9
          3      9     27      81     2     6      18      4
          6     36    216    1296     5    30     180     25
          8     64    512    4096     4    32     256     16
Total    18    110    756    5474    15    71     457     55

The solution is shown below.

         b_0     b_1       b_2                      s
(A)        5      18       110      -15           118
(B)              110       756      -71           813
(C)                      5,474     -457         5,883
(D)              226     1,800      -85         1,941     5(B) - 18(A)
(E)            1,800    15,270     -635        16,435     5(C) - 110(A)
(F)                    211,020    9,490       220,510     226(E) - 1800(D)
(G)                          1    0.04497     1.04497     (F) ÷ 211,020

(G)   $b_2 = -0.04497$

(D)   $226\,b_1 = -1{,}800(-0.04497) + 85 = 165.9460$,   $b_1 = 0.73427$

(E)   $1800\,b_1 = -15{,}270(-0.04497) + 635 = 1321.6919$,   $b_1 = 0.73427$

(A)   $5\,b_0 = -18 \times 0.73427 - 110(-0.04497) + 15 = 6.72984$,   $b_0 = 1.345968$

(B)   $18\,b_0 = -110 \times 0.73427 - 756(-0.04497) + 71 = 24.22762$,   $b_0 = 1.345979$

The regression equation is therefore

$$Y' = 1.3460 + 0.73427X - 0.04497X^2 \tag{36}$$

The sum of squares of vertical deviations from the regression
curve is

$$
\begin{aligned}
\sum (Y - Y')^2 &= \sum Y^2 - b_0 \sum Y - b_1 \sum XY - b_2 \sum X^2 Y \\
&= 55 - 1.3460 \times 15 - 0.73427 \times 71 + 0.04497 \times 457 = 3.22812
\end{aligned}
$$


Comparing this value with that found for the sum of squares of
deviations from the straight line fitted in section 18, viz., 3.60619,
we see that although it is less, as is to be expected, it is very little
less. That is, the parabola fits the points very little better than
the straight line. The small value of $b_2$ means that the term $X^2$
has little influence on our estimated value of Y.
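The parabola fit can be checked numerically. The following is a
minimal sketch, not the tabular scheme used above: it builds the
normal equations (35) from the five points of Table 9 and solves them
by Gauss-Jordan elimination in exact fractions. The helper names
(`solve_linear_system`, `power_sum`, `moment`) are illustrative only.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Bring a row with a non-zero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


X = [0, 1, 3, 6, 8]
Y = [1, 3, 2, 5, 4]


def power_sum(p):
    """Sum of X^p over the sample (p = 0 gives N)."""
    return sum(x ** p for x in X)


def moment(p):
    """Sum of X^p * Y over the sample."""
    return sum(x ** p * y for x, y in zip(X, Y))


normal_matrix = [[power_sum(i + j) for j in range(3)] for i in range(3)]
normal_rhs = [moment(i) for i in range(3)]
b0, b1, b2 = solve_linear_system(normal_matrix, normal_rhs)

# Sum of squares of deviations, as in formula (33).
residual_ss = sum(y * y for y in Y) - b0 * moment(0) - b1 * moment(1) - b2 * moment(2)
print(float(b0), float(b1), float(b2), float(residual_ss))
```

Because the fractions are exact, the results differ from the hand
computation in the last decimal places only, where the rounding of
$b_2$ to five decimals shows up.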

When the degree of the polynomial to be fitted has not been
decided upon in advance, it is possible to fit successively a constant
(the mean), a linear function, a quadratic, and so on, each function
being obtained from the preceding by the addition of a new term.
The technique of this process is fully explained by Fisher.*

EXERCISES 


1. The length of a spiral spring under various loads is given in the following
table:

Load in grams, X              10      15      20      25      30
Length in centimeters, Y    8.12    8.95    9.90    10.9    11.8

(a) Plot these values, and find the equation of a least squares line of the form
$Y' = a + bX$. (b) Find $\sum (Y - Y')^2$.

2. Table J gives the intelligence quotients and the scores on a reading
vocabulary test of a group of fifth-grade pupils. (Yates, A. M. Thesis,
Washington University, 1937) (a) Make a scatter diagram for these data.
(b) Find the equation of the line of regression of reading vocabulary on I. Q.
(c) Find the sum of squares of deviations from this line.

TABLE J

Intelligence Quotients and Scores in a Reading Vocabulary Test
of a Group of Fifth-Grade Pupils

Pupil   I. Q.   Reading        Pupil   I. Q.   Reading
                vocabulary                     vocabulary
  1      140       64            14     112       51
  2                54            15     105       44
  3                55            16     105       35
  4                56            17     103       44
  5                46            18     103       29
  6                51            19      96       33
  7                61            20      94       30
  8      118       42            21      93       31
  9      115       41            22      92       33
 10      114       47            23      76       14
 11      114       43            24      70       12
 12      113       56            25      66       15
 13      113       48

* "Statistical Methods for Research Workers," sections 27-28.1.

3. Plot the data of Table K, and fit a straight line of trend to them.

TABLE K

Rate per 100,000 Population of Automobile Fatalities in the
United States (Statistical Abstract of the United States)

Year    Rate      Year    Rate
1911     2.2      1923    14.7
1912     2.9      1924    16.6
1913     3.9      1925    17.1
1914     4.3      1926    18.0
1915     6.9      1927    19.6
1916     7.3      1928    20.8
1917     9.0      1929    23.3
1918     9.3      1930    24.6
1919     9.4      1931    26.2
1920    10.4      1932    21.9
1921    11.4      1933    23.3
1922    12.4      1934    26.8


TABLE L

Population of the United States

Year    Population      Year    Population
        in millions             in millions
1790                    1870       38.6
1800                    1880       50.2
1810                    1890       62.9
1820        9.6         1900       76.0
1830       12.9         1910       92.0
1840       17.1         1920      105.7
1850       23.2         1930      122.8
1860       31.4

4. (a) Plot the population of the United States for the census years
(Table L) on ordinary graph paper. (b) Plot these same data on
semilogarithmic paper, or plot the logarithms of the population against
equally spaced time values. (c) Fit a straight line of trend to the period
1880-1930, and calculate the theoretical values of the population for the
census years of this period. (d) Fit an exponential curve to the period
1790-1900, and calculate the theoretical values.

5. (a) Fit a parabola to the cotton production data of Table M. (b) Draw
the parabola and plot the data. (c) Find the sum of squares of deviations
from the curve.


TABLE M

Cotton Production in the United States
(Survey of Current Business)


■ 

Production 
(millions of bales) 

Year 

Production 
(millions of bales) 


17.1 

1935 

mmm 


13.0 

1936 



13.0 

1937 


■III 

9.6 


Hi 


6. The following pairs of values of the pressure p and volume v of a gas
were measured in a laboratory experiment. Find an equation of the type
p' ^ Av^ connecting the two variables. 


V 

B 

B 

2 





5 

V 

106 

89.0 

70.8 

49.0 

40.0 

30.0 

24.6 

19.5 


7. Fit an exponential curve to the data of Table 2, page 3. 

8. (a) Find a multiple regression equation of grade-point average on the 
other variables of Table N. (b) Find the sum of squares of deviations from 
the regression function. 
















TABLE N

Scores of 100 Students in an Intelligence Test, Reading Comprehension,
and Reading Rate, and Grade-Point Average for First
Semester in University

Intelligence   Reading comprehension   Reading rate   Grade-point average   Intelligence   Reading comprehension   Reading rate   Grade-point average

275 

153 

29 

1.2 

295 

180 

41 

2.4 

181 

132 

22 

0.0 

152 

107 

18 

0.6 

152 

144 

34 

0.0 

214 

198 

45 

0.2 

102 

134 

31 

1.0 

■ a 

139 

29 

0.0 

158 

145 

34 

1.2 


111 

28 

1.0 

228 

179 

47 

2.8 

mlm 

149 

38 

0.0 

273 

172 

38 

2.2 

mlm 

143 

25 

1.0 

240 

148 

30 

0.8 

■ a 

122 

20 

0.4 

235 

100 

29 

2.0 

116 

83 

22 

0.0 

131 

124 

20 

0.0 

173 

144 

37 

2.6 

247 

168 

25 

1.0 

230 

179 

37 

2.6 

187 

90 

22 

0.0 

174 

114 

24 

1.8 

150 

120 

29 

1.2 

177 

133 

32 

0.0 

234 

183 

41 

1.2 

210 

151 

26 

0.4 

215 

178 

41 

0.0 

236 

132 

29 

1.8 

237 

139 

24 

1.4 

198 

100 

34 

0.8 

182 

159 

31 

0.0 


178 

38 

1.0 

177 

162 

29 

0.0 


153 

40 

0.2 

105 

101 

48 

1.0 


144 

27 

2.8 

317 

192 

30 

2.0 


172 

44 

1.4 

202 

126 

28 

2.0 

136 

109 

32 

0.2 

101 

131 

17 

0.8 

183 

142 

20 

0.4 

181 

130 

27 

0.0 

223 

107 

50 

1.4 

170 

120 

20 

0.4 

100 

116 

24 

0.0 

271 

178 

25 

3.0 

211 

128 

18 

0.8 

174 

148 

22 

0.0 

151 

93 

20 

0.4 

101 

130 

29 


231 

171 

26 

2.2 

231 

150 

32 

mSM 

135 

152 

26 

1.4 

229 

174 

33 

2.0 

140 

154 

19 

1.2 

152 

no 

23 

0.7 

227 

149 

35 

1.4 

97 

113 

20 


204 


20 

1.4 

182 

105 

38 


223 


18 

1.4 

247 

108 

27 


142 

101 

22 

0.8 

232 

191 

44 


176 

127 

22 

0.8 

240 

155 

29 


238 

104 

27 

2.0 

180 

101 

25 


208 

177 

40 

2.0 

247 

158 

25 


103 

139 

33 

0.2 

188 

148 

18 

0.0 

195 

140 

38 

0.0 

233 

144 

27 


184 

143 

32 

0.8 

164 

113 

29 

■o 

192 

119 

22 

0.8 

270 

194 

50 


121 


34 

0.6 

220 

176 

24 

mxM 

310 

176 

42 

2.6 

200 

130 

33 

0.8 

234 

183 

41 

1.2 

170 

185 

40 

0.0 

146 

112 

18 

0.0 1 

240 

102 

50 

1.0 

261 

175 

35 1 

2.0 

192 

150 

32 

0.0 

175 

168 

30 ' 

1.2 

167 

119 

30 

0.4 

233 

182 

34 

1.0 

223 

164 

29 

0.4 

201 

156 

25 

2.4 

221 

152 

14 

1.2 

242 

187 

49 

1.4 

202 

102 

44 

0.0 

134 

170 

48 

0.8 




























CHAPTER IV 


CORRELATION 

22. Coefficient of correlation. As we realize from the dis¬ 
cussion of regression, there are many instances in which an increase 
in one variable is in general accompanied by an increase in another. 
In other words, large values of one variable tend to be associated 
with large values of a second, while small values of the first tend 
to be associated with small values of the second, without the 
existence of a strict mathematical relationship between the two. 
In such a case the variables are said to be correlated. Of course an 
increase in one variable may be accompanied by a decrease in the 
other, that is, large values of each variable may tend to be asso¬ 
ciated with small values of the other. This situation is called 
inverse correlation. 

The most widely used measure of correlation is the coefficient 
of correlation. For the case of N pairs of values of the variables 
X and Y the coefficient of correlation may be defined as 

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N \sigma_X \sigma_Y} = \frac{\sum xy}{N \sigma_X \sigma_Y} \tag{1}$$

in which $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y
respectively, and $x = X - \bar{X}$, $y = Y - \bar{Y}$ are deviations from the
means. It is to be noted that r is entirely symmetric with respect
to X and Y. Moreover, it is entirely independent of the units in
which X and Y are expressed, and may have any value between
-1 and 1. If large values of the two variables tend to be associated,
the factors $X - \bar{X}$ and $Y - \bar{Y}$ in the numerator of r will
usually have like signs and the sum of products will accumulate,
giving a value of r near to 1. If there is little correlation between
X and Y the products $(X - \bar{X})(Y - \bar{Y})$ will sometimes have a
positive sign, sometimes a negative, thus tending to neutralize
each other and resulting in a value of r near zero. Inverse
correlation yields a negative value of r. The coefficient r will have the
value 1 or -1 only in the case of perfect correlation, that is, when
there exists a definite linear relation between X and Y.

For the actual computation of r, (1) may be placed in the form

$$r = \frac{\sum XY - (\sum X)(\sum Y)/N}{[\sum X^2 - (\sum X)^2/N]^{1/2}\,[\sum Y^2 - (\sum Y)^2/N]^{1/2}} \tag{2}$$

in which the variables may be measured from any origin and in
terms of any unit. If one of the variables is measured from its
mean, r assumes one of the following simpler forms:

$$r = \frac{\sum xY}{[\sum x^2]^{1/2}\,[\sum Y^2 - (\sum Y)^2/N]^{1/2}} = \frac{\sum Xy}{[\sum X^2 - (\sum X)^2/N]^{1/2}\,[\sum y^2]^{1/2}} \tag{3}$$

If both variables are measured from their means, the resulting
form of r is shown in (1).

Let us compute r for the following pairs of numbers:

X    0, 1, 3, 6, 8
Y    1, 3, 2, 5, 4

From Table 6 we get

$$\sum X = 18, \quad \sum Y = 15, \quad \sum X^2 = 110, \quad \sum XY = 71, \quad \sum Y^2 = 55$$

Hence, using (2), we find

$$r = \frac{71 - 18 \times 15/5}{[110 - (18)^2/5]^{1/2}\,[55 - (15)^2/5]^{1/2}} = \frac{17}{(45.2 \times 10)^{1/2}} = 0.80$$
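Formula (2) translates directly into a few lines of code. The sketch
below (the function name `correlation` is illustrative) recomputes r
for the five pairs above and also illustrates the remark that the
variables may be measured from any origin and in any unit, by
recomputing r after an arbitrary positive change of origin and scale.

```python
from math import sqrt

X = [0, 1, 3, 6, 8]
Y = [1, 3, 2, 5, 4]


def correlation(xs, ys):
    """Coefficient of correlation by formula (2), from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
    return numerator / denominator


r = correlation(X, Y)

# Arbitrary (hypothetical) change of origin and unit; r should not change.
r_rescaled = correlation([2 * x + 3 for x in X], [10 * y - 7 for y in Y])
print(round(r, 4), round(r_rescaled, 4))
```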

23. Connection between correlation and regression. The
equation of the line of regression of Y on X (see section 18) is

$$y' = b_{YX}\,x, \qquad b_{YX} = \frac{\sum xy}{\sum x^2} \tag{4}$$

Similarly, the equation of the line of regression of X on Y is

$$x' = b_{XY}\,y, \qquad b_{XY} = \frac{\sum xy}{\sum y^2} \tag{5}$$

From (4), (5), and (1) we see that

$$b_{XY}\,b_{YX} = r^2 \tag{6}$$

and that r is consequently equal to the square root of the product
of the two regression coefficients, that is, to their geometric mean.
These regression coefficients will both have the same sign, and r
should be given their common sign.

It is also easily shown that

$$b_{YX} = r\,\frac{\sigma_Y}{\sigma_X} \tag{7}$$

Another instructive way of regarding the correlation coefficient
is as follows: Suppose we have fitted a line of regression,
$Y' = a + bX$, to a set of points. The points will cluster more
closely about this line than they will about the horizontal line
drawn through the mean of the Y's. That is, unless b = 0, the
sum of squares of deviations of the Y's from the estimates Y' will
be less than the sum of squares of deviations from the mean. The
quotient

$$\frac{\sum (Y - Y')^2}{\sum (Y - \bar{Y})^2} \tag{8}$$

will therefore be less than 1. The more closely the points are
clustered about the regression line the smaller will the numerator
be. The quotient (8) is small if X and Y are closely correlated,
and approaches its maximum value 1 if they are practically
independent. Its value, which varies between 0 and 1, is then a
sort of inverse measure of the correlation. If it were subtracted
from 1 it should give a measure of the correlation. Actually it
can be shown that

$$r^2 = 1 - \frac{\sum (Y - Y')^2}{\sum (Y - \bar{Y})^2} \tag{9}$$

$$r^2 = \frac{a \sum Y + b \sum XY - (\sum Y)^2/N}{\sum Y^2 - (\sum Y)^2/N} \tag{10}$$

The sign to be prefixed to r is that of the regression
coefficient b.

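Formula (9) is most readily proved if we use deviations from
the mean. Thus, for the right side of the equation we can write

$$1 - \frac{\sum (y - y')^2}{\sum y^2} = 1 - \frac{\sum y^2 - b \sum xy}{\sum y^2} = 1 - 1 + \frac{b \sum xy}{\sum y^2} = \frac{(\sum xy)^2}{\sum x^2 \sum y^2} = r^2$$

Formula (10) follows from (9) above and from (10) of section 18
(page 29).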

Formula (9) may, of course, be used to find the sum of squares 
of deviations from the line of regression if r is known. 

To the data of section 22 we have previously (section 18) fitted
a regression line $Y' = a + bX$, finding a = 1.646, b = 0.376. Also
we have, from section 22 or from Table 6,

$$\sum Y = 15, \quad \sum XY = 71, \quad \sum Y^2 = 55$$

Thus from (10)

$$r^2 = \frac{1.646 \times 15 + 0.376 \times 71 - (15)^2/5}{55 - (15)^2/5} = \frac{24.690 + 26.696 - 45}{10} = 0.6386$$

$$r = 0.80$$
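The agreement of formulas (9) and (10) can be seen on the same five
points. The sketch below fits the least squares line from the raw
sums, forms the sum of squares of deviations directly, and evaluates
$r^2$ both ways; the variable names are illustrative, and the
coefficients are carried at full precision rather than rounded to
three decimals as in the hand computation.

```python
X = [0, 1, 3, 6, 8]
Y = [1, 3, 2, 5, 4]

n = len(X)
sum_x, sum_y = sum(X), sum(Y)
sum_xx = sum(x * x for x in X)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_yy = sum(y * y for y in Y)

# Least squares line Y' = a + b*X.
b = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
a = (sum_y - b * sum_x) / n

# Sum of squares about the line and about the mean.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y))
sst = sum_yy - sum_y ** 2 / n

r_squared_from_9 = 1 - sse / sst
r_squared_from_10 = (a * sum_y + b * sum_xy - sum_y ** 2 / n) / sst
print(round(a, 3), round(b, 3), round(r_squared_from_9, 4), round(r_squared_from_10, 4))
```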

24. Correlation table. When the pairs of values of two variables
are numerous they are often grouped into a correlation table,
which is a table of double classification. Table 10 is a correlation
table, the data, however, being fictitious to admit of an easy
illustration of how r may be calculated from such a table. The
numbers in the body of the table are frequencies. For example, the
number 5 near the middle of the table indicates that there are five
individuals whose heights are between 66 and 69 inches and whose
weights are between 125 and 150 pounds.

TABLE 10

Correlation Table of Heights and Weights of a Group of Men

Weight in                  Height in inches
pounds        60-63    63-66    66-69    69-72    72-75
100-125         2        1
125-150         2        3        5        1
150-175                  2        4        1        2
175-200                           1        1



Since the value of r is quite independent of the units in which 
either variable is measured, we find it most convenient to measure 
each in terms of its class interval, and from the middle of some 
centrally located class as origin, as shown in Table 11. 


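TABLE 11
Computation of r

(Each compartment shows the frequency, followed in brackets by the
product XY and the difference X - Y.)

  Y \ X          -2           -1            0            1            2       f_Y   Y f_Y   Y^2 f_Y   (Y+1)^2 f_Y
   -1         2 [2,-1]     1 [1,0]                                             3     -3        3           0
    0         2 [0,-2]     3 [0,-1]     5 [0,0]      1 [0,1]                  11      0        0          11
    1                      2 [-1,-2]    4 [0,-1]     1 [1,0]      2 [2,1]      9      9        9          36
    2                                   1 [0,-2]     1 [2,-1]                  2      4        8          18
  f_X             4            6           10            3            2       25     10       20          65
  X f_X          -8           -6            0            3            4       -7
  X^2 f_X        16            6            0            3            8       33
  (X+1)^2 f_X     4            0           10           12           18       44


TABLE 11A                          TABLE 11B

  XY    f_XY   XY f_XY             (X-Y)^2   f_XY   (X-Y)^2 f_XY
  -1      2      -2                    0       7          0
   0     16       0                    1      13         13
   1      2       2                    4       5         20
   2      5      10                           25         33
         25      10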



























































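Charlier Checks

   $\sum X^2 = 33$            $\sum Y^2 = 20$            $\sum X^2 = 33$
   $2\sum X = -14$            $2\sum Y = 20$             $-2\sum XY = -20$
   $N = 25$                   $N = 25$                   $\sum Y^2 = 20$
   $\sum (X+1)^2 = 44$        $\sum (Y+1)^2 = 65$        $\sum (X-Y)^2 = 33$

$$r = \frac{\sum XY - \sum X \cdot \sum Y/N}{[\sum X^2 - (\sum X)^2/N]^{1/2}\,[\sum Y^2 - (\sum Y)^2/N]^{1/2}} = \frac{10 - (-7) \cdot 10/25}{(33 - 49/25)^{1/2}\,(20 - 100/25)^{1/2}} = \frac{12.80}{(31.04 \times 16)^{1/2}} = 0.57$$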

In this table the frequencies are shown in the middle of the
various compartments. The small number in the lower left-hand
corner of a compartment is the product of X and Y for that
compartment; the number in the lower right-hand corner is the
difference X - Y. The first of these is used in computing the sum
of products, $\sum XY$; the second, in making a third Charlier check,
viz.,

$$\sum (X - Y)^2 = \sum X^2 - 2\sum XY + \sum Y^2 \tag{11}$$

Table 11A shows the computation of $\sum XY$; Table 11B shows
the computation of $\sum (X - Y)^2$. In the latter the frequencies
corresponding to $(X - Y)^2 = (\pm 1)^2$ can, of course, be grouped
together, as can those corresponding to $(X - Y)^2 = (\pm 2)^2$.

To use formula (10) in the computation of r we must fit a line
of least squares, $Y' = a + bX$. The normal equations with their
solutions are

$$
\begin{aligned}
25a - 7b &= 10 \qquad & a &= 0.51546 \\
-7a + 33b &= 10 \qquad & b &= 0.41237
\end{aligned}
$$

and from (10) we have

$$r^2 = \frac{0.51546 \times 10 + 0.41237 \times 10 - 4}{20 - 4} = 0.3299, \qquad r = 0.57$$
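The whole computation from the correlation table can be sketched as
follows. The frequencies are those of Table 10, with heights coded
-2 to 2 and weights coded -1 to 2 in class-interval units as in
Table 11; the layout of the list `FREQUENCIES` (one row per weight
class) is a choice made for this sketch only.

```python
from math import sqrt

X_CODES = [-2, -1, 0, 1, 2]      # height classes 60-63, ..., 72-75
Y_CODES = [-1, 0, 1, 2]          # weight classes 100-125, ..., 175-200
FREQUENCIES = [                  # one row per weight class, one column per height class
    [2, 1, 0, 0, 0],
    [2, 3, 5, 1, 0],
    [0, 2, 4, 1, 2],
    [0, 0, 1, 1, 0],
]

# Every compartment as a (frequency, x, y) triple.
cells = [(f, x, y)
         for y, row in zip(Y_CODES, FREQUENCIES)
         for x, f in zip(X_CODES, row)]

n = sum(f for f, _, _ in cells)
sum_x = sum(f * x for f, x, _ in cells)
sum_y = sum(f * y for f, _, y in cells)
sum_xx = sum(f * x * x for f, x, _ in cells)
sum_yy = sum(f * y * y for f, _, y in cells)
sum_xy = sum(f * x * y for f, x, y in cells)

# Third Charlier check, formula (11).
sum_diff_sq = sum(f * (x - y) ** 2 for f, x, y in cells)
check_holds = sum_diff_sq == sum_xx - 2 * sum_xy + sum_yy

r = (sum_xy - sum_x * sum_y / n) / sqrt(
    (sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
print(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy, check_holds, round(r, 2))
```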


25. Correlation ratio. We have seen in section 23 that the
square of the coefficient of correlation can be obtained by subtracting
from unity the fraction whose numerator is the sum of squares
of deviations of the Y's from the line of regression and whose
denominator is the sum of squares of deviations of the Y's from
their mean. If the regression of Y on X is not linear, that is, if
the straight line of regression does not pass near to the points of the
scatter diagram, we can get a better measure of the correlation by
means of the correlation ratio.

Suppose that instead of fitting a straight line of regression to
the data of a correlation table we find the mean of each column,
plot these means on a graph, and then draw a broken line through
the resulting points. See Fig. 8, which illustrates the data of the
correlation table discussed in section 24. The dots in this figure are
the individuals; the small circles indicate the means of the respective
columns. By analogy with the square of the correlation coefficient,
we define the square of the correlation ratio to be 1 minus
the fraction whose numerator is the sum of squares of deviations
from the means of the columns and whose denominator is the sum
of squares of deviations from the general mean.

Fig. 8. Correlation Diagram. (Note that the Y-axis is positive downward
and that consequently the regression line slopes downward on the graph
although the coefficient of regression is positive.)

To express the correlation ratio analytically, suppose that we
have a correlation table consisting of k columns. Suppose that in
the column corresponding to a fixed X there are $N_X$ values of Y,

$$Y_{X1}, Y_{X2}, \ldots, Y_{Xi}, \ldots, Y_{XN_X}$$





whose mean value is $\bar{Y}_X$. If $\bar{Y}$ is the general mean of the Y's, the
correlation ratio of Y on X is the square root of

$$\eta_{YX}^2 = 1 - \frac{\sum_{X=X_1}^{X_k} \sum_{i=1}^{N_X} (Y_{Xi} - \bar{Y}_X)^2}{\sum_{X=X_1}^{X_k} \sum_{i=1}^{N_X} (Y_{Xi} - \bar{Y})^2} \tag{12}$$

This can be shown to be equal to the ratio of the weighted sum of
squares of deviations of the means of columns from the general
mean (the weights being the numbers in the columns) to the sum
of squares of deviations from the general mean, that is,

$$\eta_{YX}^2 = \frac{\sum N_X (\bar{Y}_X - \bar{Y})^2}{\sum (Y - \bar{Y})^2} \tag{13}$$

For purposes of computation (13) is superior to (12), although the
following equivalent form is perhaps still better:

$$\eta_{YX}^2 = \frac{\sum_X \left(\sum_i Y_{Xi}\right)^2 / N_X - (\sum Y)^2/N}{\sum Y^2 - (\sum Y)^2/N} \tag{14}$$

Here $N = N_1 + \cdots + N_k$ = total frequency, and $\sum Y$ and $\sum Y^2$
indicate summation over the entire table, viz., for both the index i
and the index X. That is, the denominators in (14), (12), and
(13) are the same.

We shall illustrate below the calculation of $\eta$ from the correlation
table used above (Table 10). It seems more appropriate here
to use subscripts -2, -1, 0, 1, 2 rather than 1, 2, 3, 4, 5.

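1st column (X = -2)

     Y     f    Yf   Y^2 f         $N_{-2} = 4$
    -1     2    -2     2           $\bar{Y}_{-2} = -2/4 = -0.5$
     0     2     0     0           $\sum (Y - \bar{Y}_{-2})^2 = 2 - (-2)^2/4 = 1$
           4    -2     2

2nd column (X = -1)

     Y     f    Yf   Y^2 f         $N_{-1} = 6$
    -1     1    -1     1           $\bar{Y}_{-1} = 1/6 = 0.1667$
     0     3     0     0           $\sum (Y - \bar{Y}_{-1})^2 = 3 - 1^2/6 = 2.8333$
     1     2     2     2
           6     1     3

3rd column (X = 0)

     Y     f    Yf   Y^2 f         $N_0 = 10$
     0     5     0     0           $\bar{Y}_0 = 6/10 = 0.6$
     1     4     4     4           $\sum (Y - \bar{Y}_0)^2 = 8 - 6^2/10 = 4.4$
     2     1     2     4
          10     6     8

4th column (X = 1)

     Y     f    Yf   Y^2 f         $N_1 = 3$
     0     1     0     0           $\bar{Y}_1 = 3/3 = 1$
     1     1     1     1           $\sum (Y - \bar{Y}_1)^2 = 5 - 3^2/3 = 2$
     2     1     2     4
           3     3     5

5th column (X = 2)

     Y     f    Yf   Y^2 f         $N_2 = 2$
     1     2     2     2           $\bar{Y}_2 = 1$
                                   $\sum (Y - \bar{Y}_2)^2 = 0$

$$
\begin{aligned}
\sum_X \frac{\left(\sum_i Y_{Xi}\right)^2}{N_X} - \frac{(\sum Y)^2}{N}
&= \frac{(-2)^2}{4} + \frac{1^2}{6} + \frac{6^2}{10} + \frac{3^2}{3} + \frac{2^2}{2} - \frac{(-2 + 1 + 6 + 3 + 2)^2}{25} \\
&= 1 + 0.1667 + 3.6 + 3 + 2 - 4 = 5.7667
\end{aligned}
$$

$$\sum Y^2 - \frac{(\sum Y)^2}{N} = 2 + 3 + 8 + 5 + 2 - 4 = 20 - 4 = 16$$

From (14), $\eta_{YX}^2 = 5.7667/16 = 0.3604$, $\eta_{YX} = 0.60$.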

With a calculating machine, $\sum Y^2$ can be computed directly, in
which case the columns headed $Y^2 f$ are unnecessary. These
columns are, however, necessary if we use formula (12). Ordinarily
the correlation ratio would not be calculated by (12), but to
emphasize the analogy between the correlation ratio and the
correlation coefficient, and also to verify the equivalence of (12) and (13)
or (14), we shall carry through the calculation for this particular
example. We find

$$\sum_X \sum_i (Y_{Xi} - \bar{Y}_X)^2 = 1 + 2.8333 + 4.4 + 2 + 0 = 10.2333$$

$$\sum_X \sum_i (Y_{Xi} - \bar{Y})^2 = 16, \text{ as before.}$$

From (12), $\eta_{YX}^2 = 1 - \dfrac{10.2333}{16} = 1 - 0.6396 = 0.3604$
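The equivalence of (12) and (13) for this table can be confirmed with
a short sketch. It expands each column of Table 10 into its
individual coded Y values and accumulates the sum of squares within
columns and the weighted sum of squares between column means; the
helper `column_values` and the list layout are illustrative choices.

```python
X_CODES = [-2, -1, 0, 1, 2]
Y_CODES = [-1, 0, 1, 2]
FREQUENCIES = [                  # one row per coded Y, one column per coded X
    [2, 1, 0, 0, 0],
    [2, 3, 5, 1, 0],
    [0, 2, 4, 1, 2],
    [0, 0, 1, 1, 0],
]


def column_values(col):
    """Expand one column of the table into its individual Y values."""
    return [y for y, row in zip(Y_CODES, FREQUENCIES) for _ in range(row[col])]


all_y = [y for col in range(len(X_CODES)) for y in column_values(col)]
grand_mean = sum(all_y) / len(all_y)
total_ss = sum((y - grand_mean) ** 2 for y in all_y)

within_ss = 0.0    # numerator of the fraction in (12)
between_ss = 0.0   # numerator of (13)
for col in range(len(X_CODES)):
    ys = column_values(col)
    column_mean = sum(ys) / len(ys)
    within_ss += sum((y - column_mean) ** 2 for y in ys)
    between_ss += len(ys) * (column_mean - grand_mean) ** 2

eta_squared_12 = 1 - within_ss / total_ss
eta_squared_13 = between_ss / total_ss
print(round(within_ss, 4), round(between_ss, 4), round(eta_squared_12, 4))
```

The two sums of squares add up to the total, which is why the two
formulas must agree.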

If we had used the means of rows instead of the means of
columns we should have found $\eta_{XY}$, the correlation ratio of X on Y.
The formula for $\eta_{XY}$ is obtained from any of the preceding formulas
for $\eta_{YX}$ by interchanging X and Y. It should be noted that in
general $\eta_{XY}$ is different from $\eta_{YX}$, although $r_{XY} = r_{YX} = r$. We
shall always consider $\eta$ as positive. It is necessarily larger than
(in exceptional cases equal to) the numerical value of r, for the sum
of squares of deviations from the means of arrays* will be less than
the sum of squares of deviations from the line of regression (unless
the means of arrays lie exactly on a straight line), and consequently
there is a smaller fraction to be subtracted from 1 in the case of the
correlation ratio than there is in the case of the correlation
coefficient, leaving a larger remainder. In the preceding example we
found r = 0.57, $\eta$ = 0.60.

26. Relation between correlation coefficient and correlation
ratio. Fréchet† has pointed out that the correlation coefficient
computed from a correlation table can be separated into the
product of two factors, thus:

$$r = \frac{\sum N_X (X - \bar{X})(\bar{Y}_X - \bar{Y})}{[\sum N_X (X - \bar{X})^2]^{1/2}\,[\sum N_X (\bar{Y}_X - \bar{Y})^2]^{1/2}} \cdot \frac{[\sum N_X (\bar{Y}_X - \bar{Y})^2]^{1/2}}{[\sum (Y - \bar{Y})^2]^{1/2}} \tag{15}$$

The first factor is the coefficient of correlation obtained by
associating with each value of X the mean of the corresponding values
of Y, weighted with the number of such Y's; the second is the
correlation ratio $\eta_{YX}$. The first factor depends only on the line
joining the means of columns, approaching 1 if this line
approximates a straight line. It is not affected by the dispersion of the
points in a given column. The correlation ratio, on the contrary,
is not affected when the partial distributions of the columns are
displaced by shifting the means, that is, by deforming the curve
(or broken line) of the means. Thus the first factor does not
depend on the closeness of the relation between X and Y. On the
other hand, the value of the correlation ratio may be near 1 either
because the points corresponding to a given X are near the mean
of that column or because there is a small number of Y's for each
value of X. Both these factors have to be near 1 to yield a value
of r near 1.

* Array is a generic term for row or column.

† Maurice Fréchet, "Sur le coefficient, dit de corrélation et sur la
corrélation en général," Revue de l'Institut International de Statistique, vol. 4, 1933,

We shall illustrate the separation of r into the two factors by
using the data of the correlation table given above. To facilitate
the computation we shall first throw the numerator of the first
factor into a different form. It can easily be shown that

$$\sum N_X (X - \bar{X})(\bar{Y}_X - \bar{Y}) = \sum X N_X \bar{Y}_X - \frac{(\sum N_X X)(\sum N_X \bar{Y}_X)}{N} \tag{16}$$

Using the values obtained when calculating $\eta$, we find that the
foregoing is equal to

$$-2(-2) - 1 \cdot 1 + 0 \cdot 6 + 1 \cdot 3 + 2 \cdot 2 - \frac{(-7)(10)}{25} = 12.8$$

In the calculations of r and $\eta$ respectively we have already found

$$\sum N_X (X - \bar{X})^2 = 31.04, \qquad \sum N_X (\bar{Y}_X - \bar{Y})^2 = 5.7667$$

Consequently

$$r = \frac{12.8}{(31.04 \times 5.7667)^{1/2}} \times 0.6003 = 0.57$$

which is the value previously found.
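The separation of r into two factors can be checked from the column
summaries alone. In the sketch below the column frequencies, the
column sums of Y, and $\sum Y^2$ are the values found in section 25; the
constant and variable names are illustrative.

```python
from math import sqrt

X_CODES = [-2, -1, 0, 1, 2]
COLUMN_COUNTS = [4, 6, 10, 3, 2]      # N_X for each column
COLUMN_Y_SUMS = [-2, 1, 6, 3, 2]      # sum of Y within each column
SUM_Y_SQUARED = 20                    # sum of Y^2 over the whole table

n = sum(COLUMN_COUNTS)
x_mean = sum(c * x for c, x in zip(COLUMN_COUNTS, X_CODES)) / n
y_mean = sum(COLUMN_Y_SUMS) / n
column_means = [s / c for s, c in zip(COLUMN_Y_SUMS, COLUMN_COUNTS)]

cross = sum(c * (x - x_mean) * (m - y_mean)
            for c, x, m in zip(COLUMN_COUNTS, X_CODES, column_means))
x_ss = sum(c * (x - x_mean) ** 2 for c, x in zip(COLUMN_COUNTS, X_CODES))
between_ss = sum(c * (m - y_mean) ** 2 for c, m in zip(COLUMN_COUNTS, column_means))
total_ss = SUM_Y_SQUARED - n * y_mean ** 2

first_factor = cross / sqrt(x_ss * between_ss)   # correlation of X with column means
eta = sqrt(between_ss / total_ss)                # correlation ratio of Y on X
r = first_factor * eta
print(round(cross, 2), round(first_factor, 4), round(eta, 4), round(r, 4))
```

Since the first factor cannot exceed 1 in absolute value, the product
also shows why r never exceeds $\eta$ numerically.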

27. Index of correlation. When a curve $Y' = f(X)$ is fitted
to a set of data we may define the index of correlation of Y on X
as the square root of

$$R_{YX}^2 = 1 - \frac{\sum (Y - Y')^2}{\sum (Y - \bar{Y})^2} \tag{17}$$

The index of correlation will be numerically greater than or equal
to the coefficient of correlation. It can be used in connection with
ungrouped data or grouped data. The correlation ratio can be
used only in connection with data which have been grouped into a
correlation table, or which have more than one value of Y
corresponding to a given value of X.

In section 21 we fitted a second-degree polynomial, obtaining
the curvilinear regression equation

$$Y' = 1.3460 + 0.73427X - 0.04497X^2$$

In that section we also found

$$\sum Y = 15, \quad \sum Y^2 = 55, \quad \sum XY = 71, \quad \sum X^2 Y = 457$$

Thus

$$
\begin{aligned}
R_{YX}^2 &= 1 - \frac{\sum (Y - Y')^2}{\sum (Y - \bar{Y})^2}
= 1 - \frac{\sum Y^2 - a \sum Y - b \sum XY - c \sum X^2 Y}{\sum Y^2 - (\sum Y)^2/N} \\
&= \frac{a \sum Y + b \sum XY + c \sum X^2 Y - (\sum Y)^2/N}{\sum Y^2 - (\sum Y)^2/N} \\
&= \frac{1.3460 \times 15 + 0.73427 \times 71 - 0.04497 \times 457 - (15)^2/5}{55 - (15)^2/5} \\
&= \frac{6.77188}{10} = 0.677188
\end{aligned}
$$

$$R_{YX} = 0.823$$

The coefficient of correlation was previously found to be 0.80,
and we see that the index of correlation is slightly larger.
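The index of correlation can also be obtained straight from
definition (17), by forming each deviation from the fitted curve. The
sketch below uses the rounded coefficients of equation (36) as
quoted above, so its result agrees with the hand computation only to
about three decimals.

```python
from math import sqrt

X = [0, 1, 3, 6, 8]
Y = [1, 3, 2, 5, 4]

# Rounded coefficients of the parabola of section 21, equation (36).
A, B, C = 1.3460, 0.73427, -0.04497

fitted = [A + B * x + C * x * x for x in X]
residual_ss = sum((y - f) ** 2 for y, f in zip(Y, fitted))

mean_y = sum(Y) / len(Y)
total_ss = sum((y - mean_y) ** 2 for y in Y)

index_squared = 1 - residual_ss / total_ss   # formula (17)
index = sqrt(index_squared)
print(round(residual_ss, 4), round(index_squared, 4), round(index, 3))
```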

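28. Multiple correlation. If we have found a linear regression
equation of $X_1$ on $X_2, \ldots, X_k$,

$$X_1' = a_1 + b_{12} X_2 + b_{13} X_3 + \cdots + b_{1k} X_k$$

then by analogy to the simple coefficient of correlation (see
equation (9), section 23, page 49) we define the coefficient of multiple
correlation between $X_1$ and $X_2, \ldots, X_k$ to be the square root of

$$r_{1.23 \ldots k}^2 = 1 - \frac{\sum (X_1 - X_1')^2}{\sum (X_1 - \bar{X}_1)^2} \tag{18}$$

It will be remembered (cf. (33), section 20, page 41) that

$$\sum (X_1 - X_1')^2 = \sum X_1^2 - a_1 \sum X_1 - b_{12} \sum X_1 X_2 - \cdots - b_{1k} \sum X_1 X_k \tag{19}$$

$$
=
\begin{vmatrix}
N & \sum X_1 & \sum X_2 & \cdots & \sum X_k \\
\sum X_1 & \sum X_1^2 & \sum X_1 X_2 & \cdots & \sum X_1 X_k \\
\sum X_2 & \sum X_1 X_2 & \sum X_2^2 & \cdots & \sum X_2 X_k \\
\vdots & & & & \vdots \\
\sum X_k & \sum X_1 X_k & \sum X_2 X_k & \cdots & \sum X_k^2
\end{vmatrix}
\div
\begin{vmatrix}
N & \sum X_2 & \cdots & \sum X_k \\
\sum X_2 & \sum X_2^2 & \cdots & \sum X_2 X_k \\
\vdots & & & \vdots \\
\sum X_k & \sum X_2 X_k & \cdots & \sum X_k^2
\end{vmatrix}
\tag{20}
$$

If we use (19) and the relation $\sum (X_1 - \bar{X}_1)^2 = \sum X_1^2 - (\sum X_1)^2/N$
we can reduce (18) to the form*

$$r_{1.23 \ldots k}^2 = \frac{a_1 \sum X_1 + b_{12} \sum X_1 X_2 + \cdots + b_{1k} \sum X_1 X_k - (\sum X_1)^2/N}{\sum X_1^2 - (\sum X_1)^2/N} \tag{21}$$

It can be shown that the coefficient of multiple correlation between
$X_1$ and $X_2, \ldots, X_k$ is the same as the simple coefficient of correlation
between $X_1$ and $X_1'$, the least squares linear estimate of $X_1$ from
$X_2, \ldots, X_k$.

As an example, let us consider the data of section 20 and
determine the coefficient of multiple correlation between Y and $X_1$, $X_2$.
Changing the notation in (21) slightly, we find

$$
\begin{aligned}
r_{Y.12}^2 &= \frac{b_0 \sum Y + b_1 \sum X_1 Y + b_2 \sum X_2 Y - (\sum Y)^2/N}{\sum Y^2 - (\sum Y)^2/N} \\
&= \frac{-5.9357 \times 15 + 1.2194 \times 71 + 1.7484 \times 32 - (15)^2/5}{55 - (15)^2/5} \\
&= 0.8491, \qquad r_{Y.12} = 0.92
\end{aligned}
$$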

29. Partial correlation. Suppose that we have k variables
$X_1, \ldots, X_k$. If we estimate $X_1$ from the others, excepting $X_2$,
by means of the linear regression equation

$$X_1' = a_1 + b_{13} X_3 + b_{14} X_4 + \cdots + b_{1k} X_k$$

and likewise estimate $X_2$ from the others, omitting $X_1$, by means
of the equation

$$X_2' = a_2 + b_{23} X_3 + b_{24} X_4 + \cdots + b_{2k} X_k$$

then the coefficient of partial correlation between $X_1$ and $X_2$ is the
coefficient of correlation between the residuals $X_1 - X_1'$ and
$X_2 - X_2'$. It may be regarded as the correlation between $X_1$ and
$X_2$ when the effect of the remaining variables on each of them has
been removed by linear regression equations.

* Another form is $r_{1.23 \ldots k}^2 = 1 - R/R_{11}$, where $R$ and $R_{11}$ are defined in
section 29.





If we estimate each of the variables $X_1$ and $X_2$ from all the
others by means of the equations

$$
\begin{aligned}
X_1' &= a_1 + b_{12} X_2 + b_{13} X_3 + \cdots + b_{1k} X_k \\
X_2' &= a_2 + b_{21} X_1 + b_{23} X_3 + \cdots + b_{2k} X_k
\end{aligned}
$$

the coefficient of partial correlation between $X_1$ and $X_2$ may also
be defined as the square root of

$$r_{12.3 \ldots k}^2 = b_{12} b_{21} \tag{22}$$

The partial correlation coefficient can be simply expressed by
means of determinants. Let

$$
R =
\begin{vmatrix}
1 & r_{12} & r_{13} & \cdots & r_{1k} \\
r_{12} & 1 & r_{23} & \cdots & r_{2k} \\
\vdots & & & & \vdots \\
r_{1k} & r_{2k} & r_{3k} & \cdots & 1
\end{vmatrix}
\tag{23}
$$

in which $r_{ij}$ is the coefficient of correlation between $X_i$ and $X_j$
(sometimes called the coefficient of total correlation between $X_i$
and $X_j$). Then*

$$r_{12.3 \ldots k} = \frac{-R_{12}}{(R_{11} R_{22})^{1/2}} \tag{24}$$

where $R_{ij}$ is the cofactor of $r_{ij}$ in the determinant R, that is,
$(-1)^{i+j}$ times the determinant remaining after the ith column and
jth row have been struck out of R.

The coefficient $r_{12.3 \ldots k}$ is said to be of order k - 2; formula
(24) expresses the coefficient of order k - 2 in terms of the
coefficients $r_{ij}$ of order zero.

We shall compute a coefficient of order one from the data of
Table 8. Here we have N = 5, and replacing Y by $X_3$,

$$
\begin{aligned}
\sum X_1 &= 18 & \sum X_2 &= 13 & \sum X_3 &= 15 \\
\sum X_1^2 &= 110 & \sum X_2^2 &= 45 & \sum X_3^2 &= 55 \\
\sum X_1 X_2 &= 25 & \sum X_1 X_3 &= 71 & \sum X_2 X_3 &= 32
\end{aligned}
$$

* For a proof of this statement and for a fuller discussion of partial and
multiple correlation see H. L. Rietz, "Mathematical Statistics" (Carus
Monograph 3), Open Court Publishing Co., Chicago, 1926.




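$$r_{12} = \frac{\sum X_1 X_2 - \sum X_1 \cdot \sum X_2/N}{[\sum X_1^2 - (\sum X_1)^2/N]^{1/2}\,[\sum X_2^2 - (\sum X_2)^2/N]^{1/2}} = \frac{25 - 18 \cdot 13/5}{[110 - (18)^2/5]^{1/2}\,[45 - (13)^2/5]^{1/2}} = \frac{-21.8}{(45.2)^{1/2}(11.2)^{1/2}} = -0.9689$$

$$r_{13} = \frac{71 - 18 \cdot 15/5}{(45.2)^{1/2}\,[55 - (15)^2/5]^{1/2}} = \frac{17}{(45.2)^{1/2}(10)^{1/2}} = 0.7996$$

$$r_{23} = \frac{32 - 13 \cdot 15/5}{(11.2)^{1/2}(10)^{1/2}} = \frac{-7}{(11.2)^{1/2}(10)^{1/2}} = -0.6614$$

The determinant (23) is

$$
R =
\begin{vmatrix}
1 & r_{12} & r_{13} \\
r_{12} & 1 & r_{23} \\
r_{13} & r_{23} & 1
\end{vmatrix}
$$

of which the cofactors needed are

$$
R_{11} = \begin{vmatrix} 1 & r_{23} \\ r_{23} & 1 \end{vmatrix}, \qquad
R_{12} = -\begin{vmatrix} r_{12} & r_{23} \\ r_{13} & 1 \end{vmatrix}, \qquad
R_{22} = \begin{vmatrix} 1 & r_{13} \\ r_{13} & 1 \end{vmatrix}
$$

Formula (24) becomes

$$
\begin{aligned}
r_{12.3} &= \frac{-R_{12}}{(R_{11} R_{22})^{1/2}} = \frac{r_{12} - r_{13} r_{23}}{(1 - r_{13}^2)^{1/2}\,(1 - r_{23}^2)^{1/2}} \\
&= \frac{-0.9689 - 0.7996(-0.6614)}{[1 - (0.7996)^2]^{1/2}\,[1 - (-0.6614)^2]^{1/2}} \\
&= \frac{-0.4400}{(0.360619 \times 0.5625)^{1/2}} = \frac{-0.4400}{0.45038} = -0.977
\end{aligned}
$$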
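The partial correlation of order one can be recomputed from the nine
sums quoted above. The sketch below forms the three total
correlations, then evaluates $r_{12.3}$ both by the expanded formula
and by the cofactors of the determinant R; the function name
`total_correlation` is illustrative.

```python
from math import sqrt

N = 5
SUM_1, SUM_2, SUM_3 = 18, 13, 15
SUM_11, SUM_22, SUM_33 = 110, 45, 55
SUM_12, SUM_13, SUM_23 = 25, 71, 32


def total_correlation(sum_ab, sum_a, sum_b, sum_aa, sum_bb):
    """Coefficient of total correlation between two variables from raw sums."""
    numerator = sum_ab - sum_a * sum_b / N
    return numerator / sqrt((sum_aa - sum_a ** 2 / N) * (sum_bb - sum_b ** 2 / N))


r12 = total_correlation(SUM_12, SUM_1, SUM_2, SUM_11, SUM_22)
r13 = total_correlation(SUM_13, SUM_1, SUM_3, SUM_11, SUM_33)
r23 = total_correlation(SUM_23, SUM_2, SUM_3, SUM_22, SUM_33)

# Expanded form of formula (24) for three variables.
r12_3_direct = (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# The same value through the cofactors of the determinant R.
R11 = 1 - r23 ** 2
R22 = 1 - r13 ** 2
R12 = -(r12 * 1 - r23 * r13)
r12_3_cofactor = -R12 / sqrt(R11 * R22)
print(round(r12, 4), round(r13, 4), round(r23, 4), round(r12_3_direct, 3))
```

Here the partial correlation of $X_1$ and $X_2$ is numerically even
larger than their total correlation, because removing $X_3$ takes out
variation that the two variables do not share.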


EXERCISES 

1. Find the coefficient of correlation between I.Q. and score in reading 
vocabulary test in Table J, page 43. 

2. Find the coefficient of correlation between number of red blood cells
and hemoglobin (a) for the men of Table O, (b) for the women of Table O.
(Adapted from data quoted by Dunn, Physiological Reviews, vol. 9.)

3. Table P gives the lengths of right chela (pincer) and of carapace (shell)
of 470 females of a species of deep-water crab. (Schuster, Biometrika, vol. 2.)
(a) Find the coefficient of correlation between chela length and carapace
length. (b) Find the equations of both lines of regression. (c) Find the
correlation ratio of chela length on carapace length.



TABLE O

Number of Red Blood Cells and Amount of Hemoglobin,
20 Men and 12 Women

(Red blood cells in millions per cubic millimeter; hemoglobin in grams
per 100 cc.)

                     Men                                 Women
Red blood   Hemoglobin   Red blood   Hemoglobin   Red blood   Hemoglobin
cells                    cells                    cells
  4.27        14.00        4.93        16.20        3.89        12.12
  4.40        14.41        4.97        15.40        3.95        12.10
  4.52        14.02        5.00        16.40        3.97        11.90
  4.56        14.20        5.02        15.52        4.15        13.20
  4.58        14.50        5.15        16.50        4.20        13.10
  4.64        14.30        5.20        15.75        4.26        13.60
  4.72        14.70        5.36        16.10        4.31        13.40
  4.80                     5.49        16.70        4.38        14.80
  4.84                     5.57        17.17        4.40        13.60
  4.89                     5.62        16.61        4.45        13.88
                                                    4.56        14.00
                                                    4.72        14.60


TABLE P

Correlation Table of Lengths of Right Chela and of Carapace
in 470 Crabs


Length of 


Length of right chela in millimeters 


uarupuiut; 

in milli¬ 
meters 

8-9 

9-10 


11-12 

12-13 

13-14 






9.5-10.0 










ShB 

2 

9.0- 9.5 








3 

4 

■1 

2 

8.5- 9.0 




1 



1 

6 

15 

11 

2 

8.0- 8.5 



1 


2 

1 

15 

61 

25 

3 


7.5- 8.0 



4 

1 

3 

9 

50 

34 

4 

2 


7.0- 7.5 


1 

2 


2 

43 

43 

2 




6.5- 7.0 


1 


1 

23 

28 

3 





6.0- 6.5 



1 

20 

8 







5.5- 6.0 

1 

2 

10 

9 








5.0- 5.5 


2 

3 









4.5- 5.0 

1 

6 










4.0- 4.5 

2 




































4. (a) Find the coefficient of correlation in Table Q. (b) Find the
correlation ratio $\eta_{YX}$. (c) Draw a diagram similar to Fig. 8, page 53.


TABLE Q 

Correlation Table of Amount of Nitrogen Used as Fertilizer and
Yield of Wheat


Yield of 
wheat in 

Nitrogen applied in pounds per acre, X 

bushels 










per acre. 

0- 

20- 





120- 

140- 

160- 

Y 

20 

40 





140 

160 

180 

32-36 




MM 

15 

10 

4 

6 

2 

28-32 



1 


20 

9 

5 

1 


24-28 


1 

15 

20 

3 





20-24 


2 

12 







16-20 


10 

2 







12-16 


8 








8-12 

4 

4 








4-8 

10 









0-4 

6 










5. (a) From the data of Table N find the coefficient of multiple correlation
between grade-point average and the other variables. (b) Find the coefficient
of partial correlation between grade-point average and reading comprehension.
(c) Find the coefficient of partial correlation between grade-point average
and reading rate.

6. Using the data of Table N, but leaving grade-point average entirely out
of consideration, find the coefficient of partial correlation between (a) reading
comprehension and reading rate, (b) intelligence and reading comprehension,
(c) intelligence and reading rate.

7. Tables R1, R2, R3 are correlation tables of age and weight, height and
weight, age and height, of 1138 boys. (Data were selected by Tippett's Random
Sampling Numbers from more extensive data of Isserlis, Biometrika,
vol. 11.) (a) Find the coefficient of multiple correlation of weight on height
and age. Find the coefficient of partial correlation between (b) weight and
height, (c) weight and age, (d) height and age.

8. (a) Find the correlation ratio of weight on height in Table R2. (b) Fit
a parabola of least squares to the means of columns of Table R2, weighting
each mean by the number of individuals in the corresponding column.


















TABLE R1


Correlation Table of Ages and Weights of 1138 Boys



TABLE R2 

Correlation Table of Heights and Weights of 1138 Boys 

























































TABLE R3

Correlation Table of Ages and Heights of 1138 Boys
Age in years 


Height 


in inches 

5 6 7 8 9 10 11 12 13 14 


34 2 . 1 

37 7 16 6 5 1 2 

40 22 50 29 9 7 2 1 

43 7 46 61 50 19 7 3 

46 . 12 35 56 58 35 9 8 


49 . 4 22 39 54 47 26 

52 . 1 11 15 33 45 52 

55 . 2 5 18 26 


58 ... 1 .. 2 6 16 


61 . 

64 . 1 





































CHAPTER V 


THE BINOMIAL AND NORMAL DISTRIBUTIONS 


30. Binomial distribution. In textbooks on college algebra *
it is shown that if p is the probability that an event will happen,
and q (= 1 - p) is the probability that it will fail to happen, in a
single trial, then the probability that it will happen exactly X
times in N trials is

$$P(X) = C^N_X\, p^X q^{N-X} \qquad (1)$$

$C^N_X$ being the number of combinations of N things taken X at a
time. For example, in tossing a die the probability of throwing
an ace is 1/6, that is, p = 1/6, q = 5/6. In tossing a die 5 times
(or 5 dice once) the probability of throwing exactly 3 aces is

$$\frac{5!}{3!\,2!}\left(\frac{1}{6}\right)^3\left(\frac{5}{6}\right)^2 = \frac{250}{6^5} = 0.0321$$


If a long series of hospital records shows that 40 per cent of 
cases of a certain disease fail to recover, p = 0.4 may be regarded as 
the probability of dying from the disease. Then, just as in the 
dice problem above, the probability that exactly 3 patients out 
of a given set of 5 will die is 

$$\frac{5!}{3!\,2!}(0.4)^3(0.6)^2 = 0.2304$$
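The two computations above can be reproduced with the Python standard library. This is a minimal sketch; `binom_pmf` is a hypothetical helper name, and `math.comb` supplies the number of combinations.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x happenings in n trials, each with probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Exactly 3 aces in 5 throws of a die (p = 1/6).
aces = binom_pmf(3, 5, 1 / 6)

# Exactly 3 deaths among 5 patients when the mortality rate is 40 per cent.
deaths = binom_pmf(3, 5, 0.4)

# All six terms of the expansion for 5 patients; they must sum to unity.
distribution = [binom_pmf(x, 5, 0.4) for x in range(6)]

print(round(aces, 4), round(deaths, 4), round(sum(distribution), 5))
```

Summing selected terms of `distribution` gives the probability of any range of outcomes, which is how tail probabilities for the binomial are obtained.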


The expression (1) is the general term in the binomial expansion
of $(q + p)^N$. Thus, the successive terms of this expansion
give the probabilities that the event will happen not at all, once,
twice, . . ., N times, respectively. If we compute all these
probabilities for the binomial $(0.6 + 0.4)^5$ we have the results
shown in Table 12.

* See, for example, H. L. Rietz and A. R. Crathorne, "College Algebra,"
Henry Holt & Co., New York.

TABLE 12

Probability of X Deaths in 5 Cases of a Disease for Which the
Mortality Rate Is 40 Per Cent

   X      Probability
   0        0.07776
   1        0.25920
   2        0.34560
   3        0.23040
   4        0.07680
   5        0.01024
           ---------
            1.00000

If a new treatment for this disease were being tried out, 5
patients would be too small a number from which to form a judgment
as to the efficacy of the treatment, but suppose it had been
administered to 10 patients and all but 1 of them recovered. The
probability of no deaths in a group of 10 patients is

$$P(0) = (0.6)^{10} = 0.00605$$

The probability of 1 death is

$$P(1) = 10(0.4)(0.6)^9 = 0.04031$$

The probability of fewer than 2 deaths is

$$P(<2) = P(0) + P(1) = 0.04636$$

In sets of 10 patients we should then expect at most 1 death 4.6 per
cent of the time, or about once in 20 times, so we should be somewhat
doubtful that the treatment is effective. Our doubt will be
strengthened if we consider the probability of a deviation in either
direction from the expected value, which is 40 per cent of 10, or 4.
We have already found that the probability of a deviation of 3 or
more below the expected value is $P(\le 1) = P(<2) = 0.04636$.
Similarly we can find that the probability of a deviation of 3 or
more above the expected value is $P(\ge 7) = P(>6) = 0.05476$.
Thus the probability of a numerical deviation of 3 or more from
expectation is $P(|X - 4| \ge 3) = 0.04636 + 0.05476 = 0.10112$.
That is, as a matter of pure chance we should find a deviation of 3
or more deaths from the expected number oftener than once in 10
times.

31. Normal distribution. As the number of cases grows larger
the computation of the various probabilities becomes very tedious.
Fortunately, however, as the number of cases increases the binomial
distribution approaches the so-called normal distribution

$$Y\,dX = \frac{1}{\sigma(2\pi)^{1/2}}\, e^{-(X-\mu)^2/2\sigma^2}\,dX \qquad (2)$$




In section 5 was given the transformation that would change
the normal distribution (2) into the simpler form

$$Y\,dx = \varphi(x)\,dx, \qquad \varphi(x) = (2\pi)^{-1/2} e^{-x^2/2} \qquad (3)$$

Values of $\varphi(x)$, that is, the ordinates of the normal curve, are given
in Table I at the end of this book.

The function

$$\varphi_{-1}(x) = \int_{-\infty}^{x} \varphi(x)\,dx = \int_{-\infty}^{x} (2\pi)^{-1/2} e^{-x^2/2}\,dx \qquad (4)$$

gives the probability that a normal variable, with mean zero
and standard deviation unity, will be less than x. It is the
area under the normal curve, from the extreme left to the ordinate
corresponding to the abscissa x. The difference
$\varphi_{-1}(x_2) - \varphi_{-1}(x_1)$
gives the probability that a random variate * will fall between
$x_1$ and $x_2$; it is the area under the curve between $x = x_1$ and
$x = x_2$ $(x_2 > x_1)$.

It has seemed more desirable in this book to deal with the
probability that a normal deviate will be greater than x. This
probability will also be found in Table I. If we denote it by
$P(>x)$, then we have

$$P(>x) = \int_{x}^{\infty} \varphi(x)\,dx = 1 - \varphi_{-1}(x) \qquad (5)$$

Thus, $P(>x)$ is the area under the normal curve to the right of
the ordinate corresponding to x. Further, by means of (5) it
can be shown that the probability that x will be between $x_1$ and $x_2$
is $P(>x_1) - P(>x_2)$, which is the area under the curve between
$x_1$ and $x_2$ $(x_2 > x_1)$. This may be expressed by the formula

$$\int_{x_1}^{x_2} \varphi(x)\,dx = P(>x_1) - P(>x_2) \qquad (6)$$
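Where Table I is not at hand, the upper-tail probability P(>x) of (5) and the difference in (6) can be evaluated numerically. The sketch below assumes the standard identity relating the normal integral to the complementary error function; the names `upper_tail` and `between` are hypothetical.

```python
import math

def upper_tail(x: float) -> float:
    """P(>x): area under the standard normal curve to the right of x."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def between(x1: float, x2: float) -> float:
    """Area under the standard normal curve between x1 and x2 (x2 > x1), as in (6)."""
    return upper_tail(x1) - upper_tail(x2)

print(round(upper_tail(1.0), 4))      # 0.1587
print(round(upper_tail(0.5), 4))      # 0.3085
print(round(between(-1.0, 1.0), 4))   # area within one standard deviation of the mean
```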

To illustrate the way in which a binomial distribution
approaches the corresponding normal distribution, we shall fit a
normal distribution to the binomial $(0.6 + 0.4)^{25}$ which gives, for
example, the probabilities of various numbers of deaths from 0 to
25, in 25 cases of a disease for which the mortality rate is 40
per cent.

* A variate may be defined as a particular value of a variable.

It can easily be shown * that for the binomial distribution
$(q + p)^N$ the mean number of happenings is $Np$ and that the
standard deviation of the number of happenings is $(Npq)^{1/2}$. It
can also be shown, by integration, that the mean of the normal
distribution (2) is $\mu$, that its standard deviation is $\sigma$, and that the
area underneath the curve is unity.† The method of fitting a
normal distribution to the binomial consists in choosing the mean
and the standard deviation of the normal curve equal respectively
to the corresponding parameters of the binomial distribution.

In the present case we have $Np = 25 \times 0.4 = 10$, which is
the mean or expected number of deaths in a group of 25 patients,
also $\sigma = (Npq)^{1/2} = (25 \times 0.4 \times 0.6)^{1/2} = 6^{1/2} = 2.44949$.
Consequently the corresponding normal distribution is

$$Y\,dX = \frac{1}{2.45(2\pi)^{1/2}}\, e^{-(X-10)^2/(2\times 6)}\,dX \qquad (7)$$

The two distributions may be compared graphically by drawing the
curve

$$Y = \frac{1}{2.45(2\pi)^{1/2}}\, e^{-(X-10)^2/(2\times 6)} \qquad (8)$$

and, on the same set of axes, erecting ordinates equal to the
binomial probabilities. Although this is unnecessary, it will be done
here to give a clearer insight into the meaning of the normal
approximation to the binomial distribution.


* See Yule and Kendall, "An Introduction to the Theory of Statistics,"
Charles Griffin & Co., Ltd., London, 1937.

† Another parameter connected with the normal curve is the probable error,
or probable deviation. It is defined as that deviation which, taken on each
side of the mean, will include half of the area under the normal curve. If an
item is selected at random from a normal distribution, it is equally probable
that its absolute deviation from the mean will be greater than one probable
error or less than that value. The probable error may be found by the
approximate formula P.E. = 0.67449$\sigma$.


In order to use available tables we set $(X - 10)/6^{1/2} = x$;
then (8) assumes the form

$$Y = \frac{1}{2.45(2\pi)^{1/2}}\, e^{-x^2/2} = \frac{1}{2.45}\,\varphi(x) \qquad (9)$$

From Table I we can obtain the values necessary for plotting the
curve. The binomial probabilities, worked out by the formula

$$P(X) = C^{25}_X (0.4)^X (0.6)^{25-X} = \frac{25!}{X!\,(25-X)!}(0.4)^X(0.6)^{25-X} \qquad (10)$$

are shown in Table 13. Figure 9 shows the normal curve (9) with
the probabilities (10) plotted as small circles.



Fig. 9. —Normal Curve Fitted to a Binomial Distribution. 


To get a normal probability, we must find the area under the
appropriate part of the normal curve. It must be remembered
that the binomial distribution is discrete; in fact it is often called
the point binomial. If then, for example, we wish to find the
probability of 13 deaths by using the normal approximation we
find the area under the normal curve from $X_1 = 12.5$ to $X_2 = 13.5$.
This is given by the definite integral

$$\int_{12.5}^{13.5} \frac{1}{6^{1/2}(2\pi)^{1/2}}\, e^{-(X-10)^2/(2\times 6)}\,dX \qquad (11)$$

We make the transformation

$$x = \frac{X - 10}{6^{1/2}}, \qquad dx = \frac{dX}{6^{1/2}}$$

which gives

$$x_1 = \frac{12.5 - 10}{6^{1/2}} = 1.021, \qquad x_2 = \frac{13.5 - 10}{6^{1/2}} = 1.429$$

and changes the integral (11) into

$$\int_{1.021}^{1.429} (2\pi)^{-1/2} e^{-x^2/2}\,dx = \int_{1.021}^{1.429} \varphi(x)\,dx = P(>1.021) - P(>1.429) = 0.1536 - 0.0765 = 0.0771$$


from Table I. The true probability, found by using the binomial 
formula (10), is, to six decimal places, 0.075967. 
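The comparison just made for 13 deaths can be sketched in Python. The half-unit limits 12.5 and 13.5 are those used in the text; `upper_tail` is a hypothetical helper that evaluates P(>x) through the complementary error function instead of Table I, so the approximation agrees with the tabular figure only to about four places.

```python
from math import comb, erfc, sqrt

def upper_tail(x: float) -> float:
    """P(>x) for a standard normal deviate."""
    return 0.5 * erfc(x / sqrt(2))

n, p = 25, 0.4
mean = n * p                  # Np = 10
sd = sqrt(n * p * (1 - p))    # (Npq)^(1/2) = 6^(1/2)

# Exact point-binomial probability of 13 deaths, formula (10).
exact = comb(n, 13) * p**13 * (1 - p) ** (n - 13)

# Normal approximation: area between X = 12.5 and X = 13.5.
approx = upper_tail((12.5 - mean) / sd) - upper_tail((13.5 - mean) / sd)

print(round(exact, 6), round(approx, 4))
```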

One advantage of the normal distribution is illustrated by the 
following examples: The probability of 14 or more deaths is 
given by 

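$$P\left(> \frac{13.5 - 10}{6^{1/2}}\right) = P(>1.429) = 0.0765$$

The probability of between 7 and 13 deaths, inclusive, is

$$P\left(> \frac{6.5 - 10}{6^{1/2}}\right) - P\left(> \frac{13.5 - 10}{6^{1/2}}\right) = P(>-1.429) - P(>1.429)$$

$$= 1 - 2P(>1.429) = 1 - 2 \times 0.0765 = 0.8470$$

(It can easily be shown that $P(>-x_1) = 1 - P(>x_1)$.)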


If these probabilities were calculated by the exact binomial for¬ 
mula it would be necessary to calculate all the individual proba¬ 
bilities included and then to sum them. 

The normal approximation * to the binomial $(0.6 + 0.4)^{25}$
is shown in Table 13. It is seen that the approximation is only
fairly good. It is better near the expected value $X = 10$ than it
is in the tails of the distribution. It would have been better for
larger values of N (say N = 100, in which case it would have been
too laborious to compute the binomial probabilities); it would
have been worse for p not so close to 1/2. It is difficult to lay down
fixed rules about when the normal distribution may be used as an
approximation to the binomial; this depends upon the accuracy of
the results desired, and good judgment in the matter comes only
with experience.†

TABLE 13

Probability of X Deaths in 25 Cases of a Disease for Which
the Mortality Rate Is 40 Per Cent

    X       Probability     Normal approximation
    0       0.000 003       0.000 044
    1       0.000 047       0.000 208
    2       0.000 379       0.000 840
    3       0.001 937       0.002 882
    4       0.007 104       0.008 389
    5       0.019 891       0.020 726
    6       0.044 203       0.043 419
    7       0.079 986       0.077 206
    8       0.119 980       0.116 415-
    9       0.151 086       0.149 001
   10       0.161 158       0.161 725-
   11       0.146 507       0.149 001
   12       0.113 950+      0.116 415-
   13       0.075 967       0.077 206
   14       0.043 410       0.043 419
   15       0.021 222       0.020 726
   16       0.008 843       0.008 389
   17       0.003 121       0.002 882
   18       0.000 925-      0.000 840
   19       0.000 227       0.000 208
   20       0.000 045+      0.000 044
   21       0.000 007       0.000 008
   22       0.000 001       0.000 001
   23
   24
   25
  Total     0.999 999       0.999 994

If a better approximation is desired we can use a Gram-Charlier
type A distribution. (See section 33.) For extremely small values
of p combined with large values of N the Poisson exponential
function may sometimes be used as an approximation.‡

* Obtained by using a seven-place table (Table II of Karl Pearson's "Tables
for Statisticians and Biometricians," Part 1).

† For a discussion of this question and of other approximations to the
binomial distribution, see Burton H. Camp, "Probability Integrals for the
Point Binomial," Biometrika, vol. 16, 1924, pp. 163-171; and Camp's book,
"The Mathematical Part of Elementary Statistics," D. C. Heath & Co.,
Boston, 1931.

‡ See Thornton C. Fry, "Probability and Its Engineering Uses," D. Van
Nostrand Co., New York, 1928.

32. Fitting a normal distribution to observed data.
The normal distribution is not limited in its uses to approximating
the binomial, for experience has shown that many sets of data are
distributed approximately like the normal distribution. Thus,
suppose that we know that the heights of a group of men are normally
distributed, with mean 68 inches and standard deviation
2 inches, and wish to find the probability that the height of a man
taken at random from the group is between 66 inches and 69
inches. We calculate

$$x_1 = \frac{66 - 68}{2} = -1, \qquad x_2 = \frac{69 - 68}{2} = 0.5$$

$$P(>-1) - P(>0.5) = 1 - P(>1) - P(>0.5) = 1 - 0.1587 - 0.3085 = 0.5328$$


which is the required probability. If 100 men were taken at 
random from such a population, we should expect to find about 
100 X 0.5328, or 53 of them, between 66 and 69 inches tall. 
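The same figure can be obtained without tables. The following sketch standardizes the two limits and takes the difference of upper-tail areas; `upper_tail` is a hypothetical helper built on the complementary error function.

```python
import math

def upper_tail(x: float) -> float:
    """P(>x) for a standard normal deviate."""
    return 0.5 * math.erfc(x / math.sqrt(2))

mean, sd = 68.0, 2.0            # inches, as assumed in the text
x1 = (66 - mean) / sd           # -1.0
x2 = (69 - mean) / sd           #  0.5

prob = upper_tail(x1) - upper_tail(x2)
expected_in_100 = 100 * prob

print(round(prob, 4), round(expected_in_100))  # 0.5328 53
```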



Fig. 10. Normal Curve Fitted to Distribution of Heights of Men. 

Such data as the above may be graduated by means of the 
normal distribution, that is, a normal distribution may be fitted to 
them by making its mean coincide with their mean and its standard 
deviation equal to their standard deviation. We shall illustrate 
the process by fitting a normal curve to the distribution of heights 
of men given in Table 1. 

The mean of this distribution was found to be 67.84 inches; the
standard deviation, after Sheppard's correction had been applied
to the variance, was 2.17 inches. The equation of the corresponding
normal curve is

$$Y\,dX = \frac{1}{2.17(2\pi)^{1/2}} \exp\left[-\frac{(X - 67.84)^2}{2(2.17)^2}\right] dX \qquad (12)$$

where X is to be expressed in inches. The curve is graphed in
conjunction with the histogram of heights in Fig. 10. In constructing
the graph, we must multiply each value of Y by N (= 346).

To obtain the theoretical frequencies in the various classes,
we express the class limits as deviations from the mean, divided by
the standard deviation. Obviously the theoretical proportional
frequencies must be multiplied by the total frequency N. The
work is shown in Table 14. For purposes of comparison the
observed frequencies are shown in the last column.


TABLE 14

Normal Distribution Fitted to the Frequency Distribution of
Heights of a Group of Men

  Class     X - 67.84                             346 x difference    Observed
  limit     ---------    P(>X)     Difference     = theoretical       frequency
    X          2.17                               frequency

   58        -4.53       1.0000
                                     0.0002            0.1                1
   60        -3.61       0.9998
                                     0.0034            1.2                2
   62        -2.69       0.9964
                                     0.0348           12.0                9
   64        -1.77       0.9616
                                     0.1593           55.1               48
   66        -0.85       0.8023
                                     0.3302          114.2              131
   68         0.07       0.4721
                                     0.3134          108.4              102
   70         1.00       0.1587
                                     0.1313           45.4               40
   72         1.92       0.0274
                                     0.0251            8.7               13
   74         2.84       0.0023
                                     0.0022            0.8
   76         3.76       0.0001

  Total                              0.9999          345.9              346
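The theoretical frequencies of Table 14 can be regenerated with `statistics.NormalDist` from the Python standard library. This is a sketch under the fitted values stated in the text (mean 67.84, standard deviation 2.17, N = 346); because the table rounds each deviate to two decimals and uses a four-place table, the computed frequencies differ from the printed ones by a few tenths.

```python
from statistics import NormalDist

fitted = NormalDist(mu=67.84, sigma=2.17)   # fitted normal curve, inches
limits = list(range(58, 78, 2))             # class limits 58, 60, ..., 76
n_total = 346

# Theoretical frequency of each class = N x (area between its two limits).
freqs = [
    n_total * (fitted.cdf(upper) - fitted.cdf(lower))
    for lower, upper in zip(limits, limits[1:])
]

for lower, f in zip(limits, freqs):
    print(f"{lower}-{lower + 2}: {f:6.1f}")
print("total:", round(sum(freqs), 1))
```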








33. Gram-Charlier type A distribution. If a better approximation
to a binomial distribution is desired we can, as suggested
in section 31, use a Gram-Charlier type A distribution, which also
provides a better fit than the normal function to an observed
frequency distribution such as the distribution of heights considered
above. This distribution consists of an expansion in terms
of the normal function and its derivatives, viz.,

$$Y\,dX = Y_G\,dx = [a_0\varphi(x) + a_1\varphi_1(x) + a_2\varphi_2(x) + \cdots]\,dx \qquad (13)$$

where $\varphi_k(x)$ is the kth derivative of $\varphi(x) = (2\pi)^{-1/2}e^{-x^2/2}$,
and $x = (X - \bar{X})/\sigma$, $dx = dX/\sigma$. It can be shown that if we
use x instead of X, that is, if we express our variable as a deviation
from the mean divided by the standard deviation, then $a_0 = 1$,
$a_1 = a_2 = 0$, and through the term in $\varphi_4(x)$, which is as far as we
should often want to carry the expansion,

$$Y\,dX = Y_G\,dx = \left[\varphi(x) - \frac{\mu_3}{3!\,\sigma^3}\varphi_3(x) + \frac{1}{4!}\left(\frac{\mu_4}{\sigma^4} - 3\right)\varphi_4(x)\right]dx \qquad (14)$$

The area under any portion of the Gram-Charlier curve

$$Y = \frac{1}{\sigma}\left[\varphi(x) - \frac{\mu_3}{3!\,\sigma^3}\varphi_3(x) + \frac{1}{4!}\left(\frac{\mu_4}{\sigma^4} - 3\right)\varphi_4(x)\right] \qquad (15)$$

is given by

$$\int Y\,dX = \int Y_G\,dx = \varphi_{-1}(x) - \frac{\mu_3}{3!\,\sigma^3}\varphi_2(x) + \frac{1}{4!}\left(\frac{\mu_4}{\sigma^4} - 3\right)\varphi_3(x) \qquad (16)$$

between the appropriate limits.

For the binomial distribution we have *

$$\mu_2 = Npq, \qquad \mu_3 = Npq(q - p), \qquad \mu_4 = Npq(1 - 6pq) + 3N^2p^2q^2 \qquad (17)$$

Consequently (16) reduces to

$$\int Y\,dX = \varphi_{-1}(x) - \frac{q - p}{6(Npq)^{1/2}}\varphi_2(x) + \frac{1 - 6pq}{24Npq}\varphi_3(x) \qquad (18)$$

For a further discussion of the Gram-Charlier distribution
and examples of fitting observed data and the point binomial by
means of it, the student is referred to Fry,* Camp,† and Fisher.‡

* Thornton C. Fry, "Probability and Its Engineering Uses," D. Van
Nostrand Co., New York, 1928.

34. Testing the significance of a mean when the population
standard deviation is known. If the population from which we
are taking samples is normal, with mean $\mu$ and variance $\sigma^2$, then,
as has long been well known, the means of samples of size N will
themselves be normally distributed, with mean $\mu$ and variance
$\sigma^2/N$ (i.e., standard deviation $\sigma/N^{1/2}$).§ This fact enables us to
use the normal distribution to test the significance of a mean,
provided we know the standard deviation of the population.

Suppose, for example, that we know from considerable experience
that the average breaking strength of a certain type of cotton
thread is 7.50 ounces, and that the standard deviation is 1.20
ounces. A sample of 9 pieces of thread shows a mean breaking
strength of 6.52 ounces. If we assume that the population of
breaking strengths is normally distributed we can find the probability
that the consignment from which the sample was taken is
below standard. The standard deviation of the mean of a sample
of 9 is

$$\sigma_{\bar{X}} = \frac{\sigma}{N^{1/2}} = \frac{1.20}{9^{1/2}} = 0.40$$

We now calculate

$$x = \frac{6.52 - 7.50}{0.40} = -2.45$$

The probability that the mean of a sample of 9 will be this far or
farther below the population mean is

$$\varphi_{-1}(-2.45) = P(>2.45) = 0.0071$$


That is, in samples of 9, such a deviation below the general aver¬ 
age would be expected only about 7 times in 1000, and we con¬ 
clude that the product is inferior. 
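The steps of this test can be sketched in Python. The figures are those of the thread example; `upper_tail` is a hypothetical helper that stands in for Table I.

```python
import math

def upper_tail(x: float) -> float:
    """P(>x) for a standard normal deviate."""
    return 0.5 * math.erfc(x / math.sqrt(2))

pop_mean, pop_sd = 7.50, 1.20    # ounces, known from experience
sample_mean, n = 6.52, 9

se_mean = pop_sd / math.sqrt(n)          # standard deviation of the mean, 0.40
x = (sample_mean - pop_mean) / se_mean   # normal deviate, -2.45
prob = upper_tail(abs(x))                # chance of falling this far or farther below

print(round(se_mean, 2), round(x, 2), round(prob, 4))  # 0.4 -2.45 0.0071
```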

* Thornton C. Fry, "Probability and Its Engineering Uses," D. Van
Nostrand Co., New York, 1928.

† Burton H. Camp, "The Mathematical Part of Elementary Statistics,"
D. C. Heath & Co., Boston, 1931.

‡ Arne Fisher, "The Mathematical Theory of Probabilities," The Macmillan
Co., New York, 1930.

§ Even if the population is not normal, the distribution of means will
usually be approximately normal.


It should perhaps be emphasized that "significance" is a
relative term. Thus, one person might regard a deviation as
significant if the probability of the occurrence of a greater deviation
were 0.05. Another might regard it as significant only if this
probability were 0.001. It is largely a subjective matter and
depends upon the chances that the individual is willing to take
that his judgment may be wrong. Many investigators are willing
to regard as significant any deviation (or difference) for which the
probability of a greater deviation is 0.05, and as highly significant
any deviation for which this probability is 0.01 or less.

It is conceivable that we might know the population standard 
deviation but not the population mean. For instance, we might 
be prepared to believe that the variability of strength of a certain 
type of thread is about the same although produced by different 
factories, but that the average value differed from factory to 
factory. Suppose that in the above illustration the 9 pieces of 
thread are from a certain factory and that we wish to test the
hypothesis that the mean breaking strength of this type of thread
produced by the factory is not below 6 ounces, on the assumption
that $\sigma = 1.20$ ounces. We calculate

$$P\left(> \frac{6.52 - 6.00}{0.40}\right) = P(>1.30) = 0.0968$$

so that if the mean breaking strength of the product put out by
this factory were 6 ounces, we should, in samples of 9, find 
a mean breaking strength of 6.52 ounces or greater nearly once 
in 10 times. The mean breaking strength of the product of this 
factory could very easily be as low as 6 ounces and still yield a 
sample of 9 having a mean breaking strength of 6.52 ounces. 

35. Fiducial or confidence limits. Suppose that, with a known
population standard deviation of 1.20 ounces, we want to set limits,
as judged from our sample mean of 6.52 ounces, within which we
should have some confidence that the population mean lies. If
we take the probability

$$2P\left(> \frac{|6.52 - \mu|}{0.40}\right)$$

and set it equal to 0.02, say, we find * from Table II

$$|6.52 - \mu|/0.40 = 2.33, \qquad |6.52 - \mu| = 0.93$$

$$\mu = 6.52 \pm 0.93, \qquad \mu_1 = 5.59, \qquad \mu_2 = 7.45$$

The value $\mu_2 = 7.45$ may be called the 1 per cent fiducial value of
$\mu$ for $\bar{X} = 6.52$ and $\sigma = 1.20$. Similarly, $\mu_1 = 5.59$ may be
termed the 99 per cent fiducial value of $\mu$ corresponding to the
given values of $\bar{X}$ and $\sigma$. The two values may be called the 98
per cent fiducial or confidence limits for $\mu$ corresponding to the
given $\bar{X}$ and $\sigma$. For suppose that from a sample mean $\bar{X}$ we
should assert that the population mean $\mu$ is between the limits
$\bar{X} - 2.33\sigma/N^{1/2}$ and $\bar{X} + 2.33\sigma/N^{1/2}$, that is,

$$\bar{X} - \frac{2.33\sigma}{N^{1/2}} < \mu < \bar{X} + \frac{2.33\sigma}{N^{1/2}}$$

This inequality is equivalent to

$$\mu - \frac{2.33\sigma}{N^{1/2}} < \bar{X} < \mu + \frac{2.33\sigma}{N^{1/2}}$$

But the probability that $\bar{X}$ will satisfy this last inequality is 0.98,
so that in the long run we should be right in our assertion regarding
the population mean 98 times out of 100 and wrong twice out of
100 times: once because the observed $\bar{X}$ fell below $\mu - 2.33\sigma/N^{1/2}$
and once because it was above $\mu + 2.33\sigma/N^{1/2}$, so that the inversion
of the second inequality above supplied an inequality which failed
to include $\mu$ in these instances.
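The 98 per cent limits can be computed as follows. This sketch uses `statistics.NormalDist().inv_cdf` in place of Table II, so the deviate comes out as 2.326 instead of the rounded 2.33 of the text; the limits agree to two decimals.

```python
import math
from statistics import NormalDist

sample_mean, pop_sd, n = 6.52, 1.20, 9
confidence = 0.98

# Deviate x with P(>x) = 0.01, i.e. half of the 2 per cent left outside the limits.
x = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

half_width = x * pop_sd / math.sqrt(n)
lower = sample_mean - half_width
upper = sample_mean + half_width

print(round(x, 3), round(lower, 2), round(upper, 2))  # 2.326 5.59 7.45
```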

* Note that the probability of an absolute or numerical deviation is twice
that of the corresponding algebraic deviation, and that consequently we find
in Table II the value of x corresponding to P(>x) = 0.01.

36. Testing the significance of the difference between two
means when the population standard deviation is known. If the
variables $X_1$ and $X_2$ are normally distributed with means $\mu_1$ and
$\mu_2$ respectively, their difference $X_1 - X_2$ will be normally
distributed with mean $\mu_1 - \mu_2$.

The variance of the difference between two variables is the
sum of their variances diminished by twice their covariance. In
symbols this may be stated

$$\sigma^2_{X_1 - X_2} = \sigma^2_{X_1} + \sigma^2_{X_2} - 2r\sigma_{X_1}\sigma_{X_2} \qquad (19)$$

In the last term, r is the coefficient of correlation between the
variables $X_1$ and $X_2$, and $r\sigma_{X_1}\sigma_{X_2}$ is their covariance. If $X_1$ and $X_2$
are uncorrelated this last term drops out and the foregoing formula
reduces to

$$\sigma^2_{X_1 - X_2} = \sigma^2_{X_1} + \sigma^2_{X_2} \qquad (20)$$


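The variance of the difference between two uncorrelated means
is thus

$$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2} = \frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2} \qquad (21)$$

where $N_1$ and $N_2$ are the respective numbers in the samples from
which the means $\bar{X}_1$ and $\bar{X}_2$ were calculated.

It may sometimes happen that it is reasonable to suppose that
the populations from which the samples have been drawn have
the same variance $\sigma^2$. In this case, (21) assumes the still simpler
form

$$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2\left(\frac{1}{N_1} + \frac{1}{N_2}\right) \qquad (22)$$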


For example, experience might show that the standard deviation
of the breaking strength of a type of thread such as that considered
earlier in the chapter could be regarded as 1.20 ounces. Suppose
that under this assumption a sample of 9 pieces from one factory
showed a mean strength of 6.52 ounces, while a sample of 16 pieces
from another factory showed a mean strength of 7.20 ounces. We
wish to know whether we can regard the general average output of
the second factory as superior. We can test the hypothesis that
the two samples were drawn from populations with the same mean,
that is, $\mu_1 = \mu_2$, or that the difference $\bar{X}_1 - \bar{X}_2$ has the mean
$\mu_1 - \mu_2 = 0$. Here we have

$$\sigma_{\bar{X}_1 - \bar{X}_2} = 1.20\left(\frac{1}{9} + \frac{1}{16}\right)^{1/2} = 1.20 \times \frac{5}{12} = 0.5$$

$$x = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \frac{6.52 - 7.20}{0.5} = -1.36$$

Actually we are concerned only with the absolute value of the
difference. We find from tables of the normal distribution that the
probability of a deviation greater than 1.36 above or below the
mean is

$$2P(>1.36) = 2 \times 0.0869 = 0.1738$$


This would not disprove the hypothesis, and we can not conclude 
that the product of the second factory is, on the average, superior. 
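The comparison of the two factories can be sketched in Python under the same assumption of a common known standard deviation of 1.20 ounces. The helper `upper_tail` is hypothetical and replaces the table lookup.

```python
import math

def upper_tail(x: float) -> float:
    """P(>x) for a standard normal deviate."""
    return 0.5 * math.erfc(x / math.sqrt(2))

pop_sd = 1.20
mean1, n1 = 6.52, 9     # first factory
mean2, n2 = 7.20, 16    # second factory

# Formula (22): standard deviation of the difference between the two means.
se_diff = pop_sd * math.sqrt(1 / n1 + 1 / n2)
x = (mean1 - mean2) / se_diff

# Probability of a deviation this large in either direction.
prob = 2 * upper_tail(abs(x))

print(round(se_diff, 2), round(x, 2), round(prob, 4))  # 0.5 -1.36 0.1738
```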

37. Testing the significance of the difference between two
proportions. If, from a population in which the proportion of
individuals possessing a certain characteristic is p, we take a large
number of samples of size N, then the number X of individuals
possessing this characteristic might range in these samples from
0 to N, but would on the average be $\bar{X} = Np$, and the standard
deviation of this number would be $\sigma_X = (Npq)^{1/2}$, as was stated in
section 31. The proportion of such individuals might have the
values 0, 1/N, 2/N, . . . , (N - 1)/N, 1, but would on the average
have the value $\bar{p} = p$, and the standard deviation
$\sigma_X/N = (pq/N)^{1/2}$.

If we draw repeatedly two samples, one of size $N_1$ and the
other of size $N_2$, from a common population, and if we average
the obtained values of $p_1$, and likewise those of $p_2$, we expect to
find the average proportions to be $\bar{p}_1 = \bar{p}_2 = p$, the true
population proportion, and the standard deviations to be

$$\sigma_{p_1} = \left(\frac{pq}{N_1}\right)^{1/2}, \qquad \sigma_{p_2} = \left(\frac{pq}{N_2}\right)^{1/2} \qquad (23)$$

respectively.

Suppose we wish to test the hypothesis that two observed
proportions, $p_1$ and $p_2$, obtained from samples of size $N_1$ and $N_2$
respectively, are consistent with sampling from a common population.
On this hypothesis, the expected value of the difference
$p_1 - p_2$ is $\bar{p}_1 - \bar{p}_2 = 0$, so that we wish to test whether the
observed difference is significantly different from zero. The
standard deviation of the difference, assuming no correlation
between the proportions,* is

$$\sigma_{p_1 - p_2} = (\sigma^2_{p_1} + \sigma^2_{p_2})^{1/2} = \left[pq\left(\frac{1}{N_1} + \frac{1}{N_2}\right)\right]^{1/2} \qquad (24)$$

Since $p = \bar{p}_1 = \bar{p}_2$ is unknown, it must be estimated in order to
evaluate (24), and theoretical considerations show that a good
estimate based on the samples is the weighted mean of the two
observed proportions, viz.,

$$p' = \frac{N_1p_1 + N_2p_2}{N_1 + N_2}$$

which is the proportion observed in the combined samples. If the
numbers $N_1$ and $N_2$ are large, we may assume that the quantity

$$x = \frac{p_1 - p_2}{\left[p'q'\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)\right]^{1/2}} \qquad (25)$$

$(q' = 1 - p')$ is normally distributed, and so test the significance
of the observed difference.

* The proportions might, for example, be correlated if the individuals in
the first sample were brothers of those in the second.

Suppose that in a group of 33 light-haired persons there are
26 with blue eyes, while in a group of 27 dark-haired persons there
are only 9 with blue eyes. (See Table 15.)

TABLE 15

Hair-Color and Eye-Color of 60 Persons

                   Light-haired    Dark-haired    Total
  Blue-eyed             26              9           35
  Brown-eyed             7             18           25
  Total                 33             27           60

The proportion of blue-eyed persons in the light-haired group is
$p_1 = \tfrac{26}{33} = 0.\dot{7}\dot{8}$, while in the dark-haired group it is
$p_2 = \tfrac{9}{27} = \tfrac{1}{3} = 0.\dot{3}$. For the total group,
$p' = \tfrac{35}{60} = \tfrac{7}{12} = 0.58\dot{3}$, $q' = 1 - \tfrac{7}{12} = \tfrac{5}{12} = 0.41\dot{6}$.
Further, $N_1 = 33$, $N_2 = 27$, and from (25) we have

$$x = \frac{0.7879 - 0.3333}{\left[\tfrac{7}{12} \times \tfrac{5}{12}\left(\tfrac{1}{33} + \tfrac{1}{27}\right)\right]^{1/2}} = \frac{0.4546}{0.1279} = 3.55$$

The probability of a greater numerical difference is

$$2P(>3.55) = 0.0004$$
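Formula (25) applied to the counts of Table 15 can be sketched as below. The function name `two_proportion_deviate` is hypothetical. The second call interchanges the roles of hair color and eye color in the same fourfold table and returns the same deviate.

```python
import math

def upper_tail(x: float) -> float:
    """P(>x) for a standard normal deviate."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def two_proportion_deviate(k1: int, n1: int, k2: int, n2: int) -> float:
    """Normal deviate of formula (25) for k1 out of n1 against k2 out of n2."""
    p1, p2 = k1 / n1, k2 / n2
    p_pooled = (k1 + k2) / (n1 + n2)     # p', proportion in the combined samples
    q_pooled = 1 - p_pooled              # q'
    se = math.sqrt(p_pooled * q_pooled * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Blue-eyed among 33 light-haired versus blue-eyed among 27 dark-haired.
x_by_hair = two_proportion_deviate(26, 33, 9, 27)

# Light-haired among 35 blue-eyed versus light-haired among 25 brown-eyed.
x_by_eyes = two_proportion_deviate(26, 35, 7, 25)

prob = 2 * upper_tail(abs(x_by_hair))
print(round(x_by_hair, 2), round(x_by_eyes, 2), round(prob, 4))
```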









It should be added that the test regarding the difference
between the proportion of light-haired persons among those having
blue eyes and among those having brown eyes yields precisely the
same result. This is always true in a table such as the foregoing.
We have for this test

$$p_1 = \tfrac{26}{35} = 0.7429, \qquad p_2 = \tfrac{7}{25} = 0.2800, \qquad N_1 = 35, \qquad N_2 = 25$$

$$p' = \tfrac{33}{60}, \qquad N = 60$$

$$x = \frac{0.7429 - 0.2800}{\left[\tfrac{33}{60} \times \tfrac{27}{60}\left(\tfrac{1}{35} + \tfrac{1}{25}\right)\right]^{1/2}} = \frac{0.4629}{0.1303} = 3.55+$$

38. Testing the significance of a correlation coefficient.*

The significance of a correlation coefficient r may be tested by
making the transformation

$$x = \frac{1}{2}\log_e\frac{1 + r}{1 - r} = 1.1513\,\log_{10}\frac{1 + r}{1 - r} \qquad (26)$$

Then x will be approximately normally distributed with standard
deviation $1/(N - 3)^{1/2}$, where N is the number of pairs of variates
from which r was calculated. If $\rho$ is the coefficient of correlation
in the population from which the sample was drawn, we can also
make the transformation

$$\mu = \frac{1}{2}\log_e\frac{1 + \rho}{1 - \rho} = 1.1513\,\log_{10}\frac{1 + \rho}{1 - \rho} \qquad (27)$$

and test whether x deviates significantly from $\mu$.

Suppose that the scores in two tests administered to the same 
set of 20 students have a correlation of 0.65. Could we expect 
these same tests in general to yield a correlation of as much as 0.50? 

We test the hypothesis that the population correlation is 0.50. 
That is, we assume that the population correlation is 0.50 and see 
whether the value 0.65 is unusual in such a case. If it is, we should 
reject the hypothesis. We find 


$$x = \frac{1}{2}\log_e\frac{1 + 0.65}{1 - 0.65} = 0.77530, \qquad \mu = \frac{1}{2}\log_e\frac{1 + 0.50}{1 - 0.50} = 0.54930$$

$$\sigma_x = \frac{1}{(N - 3)^{1/2}} = \frac{1}{(17)^{1/2}}$$

$$\frac{x - \mu}{\sigma_x} = (0.77530 - 0.54930)(17)^{1/2} = 0.9318$$

As a normal deviate this is, of course, not significant.

* See R. A. Fisher, "On the 'probable error' of a coefficient of correlation
deduced from a small sample," Metron, vol. 1, 1921, part 4, pp. 1-32.

To test whether this correlation coefficient is significantly dif¬ 
ferent from zero we calculate 


$$\frac{x}{\sigma_x} = 0.77530(17)^{1/2} = 3.197$$

The probability of a numerical deviation of this much or more is

$$2 \times P(>3.197) = 2 \times 0.0007 = 0.0014$$

which is extremely small, so that the observed correlation is 
decidedly significant. 
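Both tests on the correlation of 0.65 can be sketched in Python. The transformation (26) is the inverse hyperbolic tangent, so `math.atanh` is used here; `correlation_deviate` is a hypothetical helper name.

```python
import math

def upper_tail(x: float) -> float:
    """P(>x) for a standard normal deviate."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def correlation_deviate(r: float, rho: float, n: int) -> float:
    """Normal deviate for a sample correlation r against a population value rho."""
    x = math.atanh(r)       # (1/2) log_e[(1 + r)/(1 - r)], formula (26)
    mu = math.atanh(rho)    # the same transformation applied to rho, formula (27)
    return (x - mu) * math.sqrt(n - 3)

against_half = correlation_deviate(0.65, 0.50, 20)
against_zero = correlation_deviate(0.65, 0.0, 20)
prob_zero = 2 * upper_tail(against_zero)

print(round(against_half, 4), round(against_zero, 3), round(prob_zero, 4))
```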

An exact method of testing the significance of a correlation 
when the correlation in the population is zero will be given in 
Chapter VI. 

In testing the significance of a partial correlation coefficient we
proceed as above, except that the standard deviation of x is

$$\sigma_x = \frac{1}{(N - m - 3)^{1/2}} \qquad (28)$$

where m is the number of variables eliminated. Thus in testing
the significance of a partial coefficient $r_{12\cdot 3\ldots k}$ we should use
$\sigma = 1/(N - k - 1)^{1/2}$, since k - 2 variables have been eliminated.


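39. Testing the significance of the difference between two
correlation coefficients. To test the significance of the difference
between two correlation coefficients $r_1$ and $r_2$, calculated from
samples of $N_1$ and $N_2$ respectively, we make the logarithmic
transformation

$$z_1 = \tfrac{1}{2}\log_e\frac{1+r_1}{1-r_1}, \qquad z_2 = \tfrac{1}{2}\log_e\frac{1+r_2}{1-r_2} \tag{29}$$

Since

$$\sigma_{z_1} = \frac{1}{(N_1-3)^{1/2}}, \qquad \sigma_{z_2} = \frac{1}{(N_2-3)^{1/2}} \tag{30}$$

the standard deviation of the difference $z_1 - z_2$ is

$$\sigma_{z_1-z_2} = (\sigma_{z_1}^2 + \sigma_{z_2}^2)^{1/2} = \left(\frac{1}{N_1-3} + \frac{1}{N_2-3}\right)^{1/2} \tag{31}$$

We therefore set

$$x = \frac{z_1 - z_2}{\left(\dfrac{1}{N_1-3} + \dfrac{1}{N_2-3}\right)^{1/2}} \tag{32}$$

and regard x as a normal deviate, with unit standard deviation.
We test whether x is significantly different from its expected value,
zero.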

Suppose that a correlation coefficient of $r_1 = 0.60$ has been
obtained from a sample of size 28 and that another of $r_2 = 0.40$
has been obtained from a sample of size 23. Are they significantly
different? We find

$$z_1 = 0.69315, \qquad z_2 = 0.42365$$

$$x = \frac{0.26950}{(0.09)^{1/2}} = 0.898$$

which is not significant.
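The comparison of two correlation coefficients can be sketched in Python as follows. The function names are assumptions made for illustration; the sketch simply applies the logarithmic transformation to each coefficient and divides the difference by its standard deviation.

```python
import math

def fisher_z(r):
    """One half the natural log of (1 + r)/(1 - r)."""
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_correlations(r1, n1, r2, n2):
    """Return the two transformed values and the normal deviate of their difference."""
    z1, z2 = fisher_z(r1), fisher_z(r2)
    # Standard deviation of the difference: variances 1/(N - 3) are added.
    sd_diff = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return z1, z2, (z1 - z2) / sd_diff

# Example of the text: r1 = 0.60 from 28 pairs, r2 = 0.40 from 23 pairs.
z1, z2, x = compare_correlations(0.60, 28, 0.40, 23)
print(round(z1, 5), round(z2, 5), round(x, 3))
```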


EXERCISES 

1. In the manufacture of a certain article it is known that 2 per cent of the 
articles are defective. What is the probability that a random sample of 10 
of the articles will contain (a) no defectives, (b) exactly 1 defective, (c) exactly 
2 defectives, (d) not more than 2 defectives? 

2. If the probability that a person 65 years old will die within 1 year is 
0.04, find the probability that of 5 persons 65 years old exactly 1 will die dur¬ 
ing the year. 

3. A certain disease has a fatality of 10 per cent. The records of a hospital 
show that, out of 15 patients belonging to a designated occupational class and 
admitted with this disease, 4 died. Does this indicate a lack of resistance to 
the disease on the part of this occupational class? 

4. The presidents of the United States, up to and including Franklin D. 
Roosevelt, have a total of 70 sons and 46 daughters (World Almanac). Is this
an unusual proportion if the ratio of male births to total births in the popula¬ 
tion at large is 0.51? 

5. If a baseball player has a batting average of 0.300, what is the
probability that he will get at least 25 hits out of 100 times at bat?

6. In a certain university the number of failures in freshman English, over
a period of years, is 8 per cent. In a given year there were 50 failures in a
class of 500. Can this be attributed to chance?

7. Supposing that the observer who made the telescopic readings of exer¬ 
cise 1, page 24, is known to have a standard deviation in his readings of 
0''.52, and that the population mean is the same as that of the sample of 15 
readings, find the probability of a deviation as great as or greater than the 
maximum deviation among the 15 readings, under the further assumption of 
a normal population.* 

8. Fit a normal curve to the data of Table B, page 8, finding the theo¬ 
retical frequencies of the various classes. 

9. (a) Fit a normal curve to the total frequencies of carapace length in
Table P, page 63. (b) Fit a normal curve to the total frequencies of right
chela length in this table.

10. (a) Fit a normal curve to the distribution of weights obtained as
marginal totals in either Table R1 or Table R2. (b) Fit a normal curve to the
distribution of heights obtained as marginal totals in either Table R2 or
Table R3.

11. Suppose that it has been determined that the average pulse rate of
males in the 20-25 year age group is 72 beats per minute and that the standard 
deviation is 9.5 beats per minute. If a group of 25 distance runners, all in 
the given age group, were examined and found to have an average pulse rate 
of 65, should this be regarded as a significant deviation from the general 
average? 

12. Suppose that the standard deviation of stature in men is 2.48 in. One
hundred male students in a large university are measured and their average
height is found to be 68.52 in. Determine the 98 per cent confidence limits
for the mean height of the men of the university.

13. If 50 freshmen in a given university are found to have a mean height
of 68.60 in., and 40 seniors a mean height of 69.51 in., is the evidence
conclusive that the mean height of the seniors is greater than that of the
freshmen? Assume the standard deviation of height to be 2.48 in.

14. A certain intelligence test has been administered to a large group of 
pupils and it has been found that the standard deviation of scores is 38.5 for 
girls and 35.2 for boys. The test is given to a group of 17 girls, who make 
an average score of 185.1, and to a group of 23 boys, who make an average 
score of 156.7. Is there a significant difference in intelligence between these 
two groups? 

15. One thousand articles from a factory are examined and found to be
3 per cent defective. Fifteen hundred similar articles from a second factory 
are found to be only 2 per cent defective. Can it reasonably be concluded 
that the product of the first factory is inferior to that of the second? 

16. In 1919, in the original registration states, the number of white males
dying between the ages of 30 and 31 was 1609 out of a population of 253,445
of white males in this age group; the corresponding figures for white females of
the same age were 1302 out of 239,912 (“United States Life Tables”). Is
there a significant difference between the death rates of the two sexes at this
age?

* See Paul R. Rider, “Criteria for Rejection of Observations,” Washington
University Studies, new series, Science and Technology, No. 8, St. Louis, 1933.

17. In 1910, in the original registration states, the number of negro males 
dying between the ages of 30 and 31 was 115 out of 6976. Can it be assumed 
that there is a racial difference in death rates at this age? (See preceding
exercise for data on white males.) 

18. In 1910, for white males in the age group 30-34 years, the number 
dying in Chicago was 902 out of 106,307; in New York 2130 out of 221,598 
(“United States Life Tables”). Are the death rates in the two cities signifi¬ 
cantly different? 

19. How significant are the results of vaccination shown in Table S? 

TABLE S

Smallpox Data, London, 1901
(Macdonell, Biometrika, vol. 1)

                 Recoveries    Deaths
Vaccinated           662         108
Unvaccinated          96          98


20. A correlation coefficient of 0.5 is discovered in a sample of 19 pairs. 
Is this significantly different (a) from 0.3, (b) from 0?

21. Answer the questions of the foregoing exercise for the case in which the 
coefficient is a partial correlation coefficient $r_{12.345}$.

22. The correlation coefficient between mathematics aptitude and lan¬ 
guage aptitude for a group of 20 boys is 0.42. For a group of 25 girls the 
correlation is 0.75. Is the difference significant? 

23. How probable is it that the correlation in the population from which 
the sample shown in Table J, page 43, was taken is 0.50 or greater? (See 
exercise 1, page 62.) 

24. Is there a significant sex difference in correlation between red blood 
cells and hemoglobin as determined from Table O, page 63? (See exercise 2, 
page 62.) 




CHAPTER VI 


STUDENT’S DISTRIBUTION 

40. Student’s distribution and the reliability of a mean. We
have seen in the preceding chapter that, if we know the standard
deviation $\sigma$ of a normal population, we can test the probability
that a sample mean $\bar{X}$ deviates by more than a specified amount
from the population mean $\mu$ by treating the quantity
$(\bar{X} - \mu)/\sigma_{\bar{X}}$, where $\sigma_{\bar{X}} = \sigma/N^{1/2}$ is the standard
deviation of the mean, as a normal deviate with unit standard deviation.
If, however, we do not know the standard deviation of the population, it
becomes necessary to estimate it from the sample. We might use the
standard deviation of the sample,* viz., $s = [\Sigma(X - \bar{X})^2/N]^{1/2}$, but
for certain reasons it is more desirable to use†

$$s[N/(N-1)]^{1/2} \tag{1}$$

Then the estimated standard deviation of the mean, found by
dividing (1) by $N^{1/2}$, is $s/(N-1)^{1/2}$, and if we set

$$t = \frac{\bar{X} - \mu}{s/(N-1)^{1/2}} = \frac{\bar{X} - \mu}{[\Sigma(X - \bar{X})^2/N(N-1)]^{1/2}} \tag{2}$$


* Any quantity such as a standard deviation, a median, a correlation
coefficient, when calculated from a sample, is called a statistic; the
corresponding quantity in the population is called a parameter.

† The expression (1) is the value of $\sigma$ which will make the value of s
obtained from the sample the most probable. It is obtained by the method
which Fisher terms that of maximum likelihood and is called the optimum
value of $\sigma$. (See section 50; also Deming and Birge, “On the statistical
theory of errors,” Graduate School of the U. S. Department of Agriculture,
Washington, reprinted with additional notes dated 1937 from Reviews of
Modern Physics, vol. 6, 1934, pp. 119-161.)


then the quantity t is not distributed normally, but in “Student’s”
distribution

$$Y\,dt = \frac{[(n-1)/2]!}{(n\pi)^{1/2}\,[(n-2)/2]!}\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt \tag{3}$$

with n = N − 1, the so-called number of degrees of freedom.

The symbol k!, called “k factorial,” is defined as

$$k! = \int_0^\infty x^k e^{-x}\,dx \tag{4}$$

This reduces to k(k − 1)(k − 2) ... 3·2·1 if k is an integer. The
integral (4) is often called the gamma function, $\Gamma(k + 1)$, and we
have the relation $k! = \Gamma(k + 1)$.

Any quantity t which is the ratio of a normal deviate to a
stochastically independent* estimate of its standard deviation, obtained
from samples from a normal population, is distributed in Student’s
distribution with n equal to the number of degrees of freedom utilized
in estimating the standard deviation.†

The variance of t is n/(n − 2), and as n becomes larger the
distribution approaches a normal distribution. For n > 30 it is
permissible to use tables of the normal distribution by taking
$t[(n-2)/n]^{1/2}$ as a normal deviate with unit standard deviation, and
for n > 100 the quantity t itself may be considered as the deviate.
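As a numerical illustration of the distribution (3) and of the variance n/(n − 2), the following Python sketch evaluates the ordinate through the gamma function and integrates it by Simpson's rule. The function names, the integration range of ±200, and the choice n = 9 are assumptions made for the illustration only.

```python
import math

def student_density(t, n):
    """Ordinate of Student's distribution with n degrees of freedom.

    [(n - 1)/2]! is gamma((n + 1)/2) and [(n - 2)/2]! is gamma(n/2).
    """
    const = math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return const * (1 + t * t / n) ** (-(n + 1) / 2)

def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

n = 9
area = simpson(lambda t: student_density(t, n), -200, 200)
variance = simpson(lambda t: t * t * student_density(t, n), -200, 200)
print(round(area, 4), round(variance, 4), round(n / (n - 2), 4))
```

The total area should come out very close to unity and the computed variance close to n/(n − 2).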

The probability that t will be numerically greater than a
specified value has been tabulated,‡ but in practice one generally
uses tables such as Table V at the end of this book, in which t is
tabulated in terms of the probability that it will be exceeded in
random samples.

* I.e., independent in a probability sense.

† See R. A. Fisher, “Applications of ‘Student’s’ distribution,” Metron,
vol. 6, 1926, No. 3, pp. 90-104.

‡ See Metron, vol. 6, 1926, No. 3, pp. 106-120.

In order to gain an idea of the use of Student’s distribution let
us suppose that a machine which produces mica insulating washers
for use in electrical devices is set to turn out washers having a
thickness of 10 mils (1 mil = 0.001 inch). A sample of 10 washers
has an average thickness of 9.52 mils with a standard deviation
of 0.60 mil. Let us see the significance of such a deviation. We find
by using (2) that

$$t = \frac{9.52 - 10}{0.60/9^{1/2}} = -2.4, \qquad n = 10 - 1 = 9$$


The probability that t is numerically greater than 2.4, or that the 
mean thickness of a sample of 10 washers will deviate more than 
2.4 times its estimated standard deviation above or below the 
population mean (here assumed to be 10 mils), is, from the tables 
in Metron just referred to, found to be 


$$P(|t| > 2.4) = 0.04$$

For such a probability we should be inclined to say that the
deviation is not altogether due to chance. In practice we should merely
note in Table V that 2.4 > 2.262, the 5 per cent level for n = 9.
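The washer example can be reproduced without tables by integrating Student's distribution numerically. This is only a sketch: the function names are illustrative, and Simpson's rule stands in for the published tables.

```python
import math

def student_density(t, n):
    """Ordinate of Student's distribution with n degrees of freedom."""
    const = math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return const * (1 + t * t / n) ** (-(n + 1) / 2)

def two_sided_probability(t_obs, n, steps=2000):
    """P(|t| > |t_obs|): one minus the Simpson integral over [-|t_obs|, |t_obs|]."""
    a = abs(t_obs)
    h = 2 * a / steps
    total = student_density(-a, n) + student_density(a, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_density(-a + i * h, n)
    return 1 - total * h / 3

# Sample of 10 washers: mean 9.52 mils, standard deviation 0.60 mil, target 10 mils.
mean, target, s, size = 9.52, 10.0, 0.60, 10
t = (mean - target) / (s / math.sqrt(size - 1))  # divisor is s/(N - 1)^(1/2)
p = two_sided_probability(t, size - 1)
print(round(t, 2), round(p, 3))
```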

41. Confidence limits for the population mean.* The foregoing
probability statement can be phrased as follows: If 10 is
the mean thickness, then the probability of obtaining a sample of 
10 with a mean and a standard deviation leading to a more improb¬ 
able value of t is 0.04. A similar probability statement can be 
made on the basis of the assumption that the mean thickness of 
the population of washers is 9, or 11, or any other value. 

Suppose that we wish to estimate, from our sample, limits
within which the population mean lies, with some assurance that
we are correct. For example, suppose that we want to be 98 per
cent sure (in the sense that we should be correct in our judgment
98 per cent of the time). We can set $P(|t| \geq t_1) = 0.02$, and for 9
degrees of freedom find, from Table V, that $t_1 = 2.821$. That is,

$$\frac{|9.52 - \mu|}{0.60/9^{1/2}} \leq 2.821, \qquad \text{or} \qquad 8.9558 \leq \mu \leq 10.0842$$

* For a clear elementary discussion of the subject of confidence limits and
fiducial probability the reader is referred to H. L. Rietz, “On a recent advance
in statistical inference,” American Mathematical Monthly, vol. 45, 1938,
pp. 149-158.




If $\mu$ is less than 8.9558 or greater than 10.0842, then |t| found
from our sample values of $\bar{X}$ and s will exceed 2.821, and therefore,
if any such value of $\mu$ is chosen as a hypothetical mean, the sample
will be regarded as inconsistent at the 2 per cent level of
significance, even though this value of $\mu$ may be the true one. On the
other hand, if any value of $\mu$ for which $8.9558 \leq \mu \leq 10.0842$ is
chosen as a hypothetical mean, then for this choice |t| will be
less than 2.821 and the sample will be regarded as consistent with
such a hypothesis at the 2 per cent level of significance. But,
whatever be the true value of $\mu$, in repeated samples with $\mu$ set
equal to this value, we shall obtain simultaneous values of $\bar{X}$ and s
such that $|t| \leq 2.821$ in 98 per cent of cases, so that whatever be
the value of $\mu$ we expect consistent samples at the 2 per cent level of
significance in 98 per cent of cases. Therefore, if we suppose our
sample to be consistent (at the 2 per cent level of significance) with
regard to the unknown value of $\mu$ from which it did arise, then
in 98 per cent of cases in the long run we shall be correct in our
supposition. Consequently, in the present instance we suppose
our sample to be consistent, which means that we suppose
$8.9558 \leq \mu \leq 10.0842$; and although we may be wrong in this
instance, yet 98 per cent of the time that we write such inequalities,
based on the values of $\bar{X}$ and s observed, we shall be right; for 98
per cent of the time we shall have drawn a consistent sample,
whatever $\mu$.

The values 8.9558 and 10.0842 may be termed 98 per cent
fiducial or confidence limits of $\mu$ corresponding to the observed
sample.
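The limits just quoted follow from a one-line calculation, sketched here in Python. The function name is illustrative, and the tabular value 2.821 for 9 degrees of freedom is supplied by hand, as it would be from Table V.

```python
import math

def confidence_limits(mean, s, size, t_crit):
    """Limits mean -/+ t_crit * s/(N - 1)^(1/2), with s the sample standard deviation."""
    half_width = t_crit * s / math.sqrt(size - 1)
    return mean - half_width, mean + half_width

# Washer sample: mean 9.52, s = 0.60, N = 10; 2 per cent value of t for n = 9.
lower, upper = confidence_limits(9.52, 0.60, 10, 2.821)
print(round(lower, 4), round(upper, 4))
```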

42. Testing the significance of the difference between two
means. If we wish to determine the significance of the difference
between two means of small samples we test the hypothesis that
they came from the same normal population. If two variables
having variances $\sigma_1^2$ and $\sigma_2^2$ respectively are uncorrelated, the
variance of their difference is $\sigma_1^2 + \sigma_2^2$. Thus the variance of the
difference between two means $\bar{X}_1$ and $\bar{X}_2$ of samples of $N_1$ and
$N_2$ respectively, taken from a population whose variance is $\sigma^2$, is
$\sigma^2/N_1 + \sigma^2/N_2 = \sigma^2(1/N_1 + 1/N_2)$. If we do not know the
population variance we estimate it from the expression

$$s'^2 = \frac{\Sigma(X_1 - \bar{X}_1)^2 + \Sigma(X_2 - \bar{X}_2)^2}{N_1 + N_2 - 2} = \frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2} \tag{5}$$

where $s_1^2$ and $s_2^2$ are the variances of $X_1$ and $X_2$ respectively. The
denominator of the foregoing fraction is the number of degrees of
freedom used in estimating $\sigma^2$. To obtain this we must deduct
two from the number of observations, one degree having been
used up in calculating $\bar{X}_1$, another in calculating $\bar{X}_2$. The
estimate of the standard deviation of $\bar{X}_1 - \bar{X}_2$ would then be
$s'(1/N_1 + 1/N_2)^{1/2}$ and we write

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s'\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)^{1/2}} = \frac{\bar{X}_1 - \bar{X}_2}{\left[\left(\dfrac{N_1 + N_2}{N_1 + N_2 - 2}\right)\left(\dfrac{N_1 s_1^2 + N_2 s_2^2}{N_1 N_2}\right)\right]^{1/2}} \tag{6}$$

Then t is distributed in Student’s distribution with the number of
degrees of freedom $n = N_1 + N_2 - 2$. If $N_1 = N_2 = N$, (6)
reduces to the simpler form

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\left[(s_1^2 + s_2^2)/(N - 1)\right]^{1/2}} \tag{7}$$

with n = 2(N − 1).

As an illustration let us consider the following problem: The 
ash content of coal from two different mines was analyzed, five 
analyses being made of the coal from the first mine, four of that 
from the second mine. Are we justified in supposing that the two 
mines consist of coal with the same percentage ash content on the 
basis of the results obtained, which are recorded in Tables 16A 
and 16B? 

$$N_1 = 5, \qquad \bar{X}_1 = \frac{107.5}{5} = 21.5, \qquad N_1 s_1^2 = 30.02$$

$$N_2 = 4, \qquad \bar{X}_2 = \frac{72.0}{4} = 18.0, \qquad N_2 s_2^2 = 7.78$$

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\left[\left(\dfrac{N_1 + N_2}{N_1 + N_2 - 2}\right)\left(\dfrac{N_1 s_1^2 + N_2 s_2^2}{N_1 N_2}\right)\right]^{1/2}} = \frac{21.5 - 18.0}{\left[\dfrac{9}{7} \times \dfrac{30.02 + 7.78}{20}\right]^{1/2}} = \frac{3.5}{(2.43)^{1/2}} = 2.245+$$

$$n = N_1 + N_2 - 2 = 7, \qquad P(|t| > 2.245) = 0.06$$

TABLE 16A

Coal from First Mine

Per cent ash content
     X1       X1 − X̄1     (X1 − X̄1)²
    24.3         2.8          7.84
    20.8        −0.7          0.49
    23.7         2.2          4.84
    21.3        −0.2          0.04
    17.4        −4.1         16.81
   107.5         0.0         30.02

TABLE 16B

Coal from Second Mine

Per cent ash content
     X2       X2 − X̄2     (X2 − X̄2)²
    18.2         0.2          0.04
    16.9        −1.1          1.21
    20.2         2.2          4.84
    16.7        −1.3          1.69
    72.0         0.0          7.78


Thus, there are about 6 chances in 100 of observing a greater 
difference in percentage ash content, on the supposition that the 
mines are equal in this respect, and at the 5 per cent level of sig¬ 
nificance we can not judge that any difference between the mines 
has been detected. 
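The coal example can be recomputed directly from the ash percentages of Tables 16A and 16B. The sketch below is illustrative only; the function name is an assumption, and the sums of squared deviations play the part of $N_1 s_1^2$ and $N_2 s_2^2$.

```python
import math

def pooled_t(sample1, sample2):
    """Student's t for the difference between two means, with its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)  # N1 * s1^2
    ss2 = sum((x - m2) ** 2 for x in sample2)  # N2 * s2^2
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)   # estimate of the population variance
    t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

mine1 = [24.3, 20.8, 23.7, 21.3, 17.4]
mine2 = [18.2, 16.9, 20.2, 16.7]
t, n = pooled_t(mine1, mine2)
print(round(t, 3), n)
```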

43. Testing the significance of a regression coefficient. Suppose
that we have fitted a regression line Y' = a + bX or y' = bx
to a set of N pairs of values of X and Y and wish to see whether the
regression coefficient b differs significantly from some hypothetical
value $\beta$. We have previously found that

$$b = \frac{\Sigma xy}{\Sigma x^2}$$

which is a weighted sum of y’s, the weight corresponding to $y_i$ being
$w_i = x_i/\Sigma x^2$. If now we assume that, for a given value of x, y is
normally distributed with variance $\sigma^2$, b will be normally
distributed with variance

$$\sigma_b^2 = \frac{\sigma^2}{\Sigma x^2} \tag{8}$$

If we do not know $\sigma^2$ we must estimate it from the data. As an
estimate we use

$$s'^2 = \frac{\Sigma(Y - Y')^2}{N - 2} \tag{9}$$

the divisor N − 2 being the number of degrees of freedom left
after the two constants a and b have been calculated.

If we divide $b - \beta$ by the estimate of the standard deviation of
b we have a quantity distributed in Student’s distribution with
N − 2 degrees of freedom. The estimated value of $\sigma_b$ is

$$\frac{s'}{(\Sigma x^2)^{1/2}} = \left[\frac{\Sigma(Y - Y')^2}{(N - 2)\,\Sigma(X - \bar{X})^2}\right]^{1/2} \tag{10}$$

and we set

$$t = (b - \beta) \div \left[\frac{\Sigma(Y - Y')^2}{(N - 2)\,\Sigma(X - \bar{X})^2}\right]^{1/2}, \qquad n = N - 2 \tag{11}$$


For the illustration used in section 18 we have

$$b = 0.376, \qquad \Sigma(Y - Y')^2 = 3.6062, \qquad \Sigma(X - \bar{X})^2 = 110 - \frac{324}{5}, \qquad N - 2 = 3$$

To test whether b is significantly different from zero we set $\beta = 0$ in
(11) and find

$$t = 0.376 \div \left[\frac{3.6062}{3(110 - 324/5)}\right]^{1/2} = \frac{0.376}{0.1631} = 2.31$$

This is not significant, as the 5 per cent value of t for 3 degrees of
freedom is 3.182.
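A minimal Python sketch of this test, working from the summary quantities quoted above, is given here. The function and argument names are assumptions introduced for the illustration.

```python
import math

def slope_t(b, beta, resid_ss, x_ss, n_pairs):
    """t for a regression coefficient b against a hypothetical value beta.

    resid_ss is the sum of squared deviations of Y from the fitted line;
    x_ss is the sum of squared deviations of X from its mean.
    """
    est_sd_b = math.sqrt(resid_ss / ((n_pairs - 2) * x_ss))
    return (b - beta) / est_sd_b, n_pairs - 2

# Quantities quoted in the text for the illustration of section 18.
t, n = slope_t(b=0.376, beta=0.0, resid_ss=3.6062, x_ss=110 - 324 / 5, n_pairs=5)
print(round(t, 2), n)
```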

44. Testing the significance of the difference between two
regression coefficients. Suppose that we have two regression
equations

$$Y'_1 = a_1 + b_1 X, \qquad Y'_2 = a_2 + b_2 X \tag{12}$$

calculated from $N_1$ and $N_2$ pairs of values respectively. To test
the significance of the difference between $b_1$ and $b_2$ we calculate

$$s'^2 = \frac{\Sigma(Y_1 - Y'_1)^2 + \Sigma(Y_2 - Y'_2)^2}{N_1 + N_2 - 4} \tag{13}$$

the denominator being the total number of degrees of freedom,
$(N_1 - 2) + (N_2 - 2)$. The estimated variances of $b_1$ and $b_2$ are
respectively

$$\frac{s'^2}{\Sigma x_1^2} = \frac{s'^2}{\Sigma X_1^2 - (\Sigma X_1)^2/N_1}, \qquad \frac{s'^2}{\Sigma x_2^2} = \frac{s'^2}{\Sigma X_2^2 - (\Sigma X_2)^2/N_2}$$

and the estimated variance of the difference $b_1 - b_2$, under the
assumption of no correlation between $b_1$ and $b_2$, is the sum of the
foregoing. We test the difference $b_1 - b_2$ by setting

$$t = \frac{b_1 - b_2}{\left[s'^2\left(\dfrac{1}{\Sigma x_1^2} + \dfrac{1}{\Sigma x_2^2}\right)\right]^{1/2}}, \qquad n = N_1 + N_2 - 4 \tag{16}$$

and continuing as in the preceding sections.
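Since no numerical example accompanies this section, the following Python sketch applies the procedure to two small sets of values that are purely hypothetical, invented here for illustration. The function names are likewise assumptions.

```python
import math

def fit_line(xs, ys):
    """Return the slope, the sum of squares of x about its mean, and the residual sum of squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    resid = sum((y - my - b * (x - mx)) ** 2 for x, y in zip(xs, ys))
    return b, sxx, resid

def slope_difference_t(x1, y1, x2, y2):
    """t for the difference between two regression coefficients, with its degrees of freedom."""
    b1, sxx1, res1 = fit_line(x1, y1)
    b2, sxx2, res2 = fit_line(x2, y2)
    n = len(x1) + len(x2) - 4
    pooled = (res1 + res2) / n  # pooled estimate of variance about the lines
    t = (b1 - b2) / math.sqrt(pooled * (1 / sxx1 + 1 / sxx2))
    return t, n

# Hypothetical data, not taken from the text.
x1, y1 = [1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.1]
x2, y2 = [1, 2, 3, 4, 5, 6], [0.8, 1.1, 2.1, 2.0, 3.1, 3.2]
t, n = slope_difference_t(x1, y1, x2, y2)
print(round(t, 3), n)
```

The resulting t would be referred to a table of Student's distribution with the printed number of degrees of freedom.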

45. Testing the significance of a partial regression coefficient.
Suppose that we have fitted a multiple regression equation

$$Y' = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k \tag{17}$$

and wish to test the significance of one of the regression
coefficients, say $b_i$. It can be shown, by the method used in the
preceding section, that the estimated variance of $b_i$ is $c_{ii}s'^2$, where

$$s'^2 = \frac{\Sigma(Y - Y')^2}{N - k - 1} \tag{18}$$

and $c_{ii}$ is found as in section 20 of Chapter III. The denominator
in (18) is the number of sets of values used in obtaining the
regression equation diminished by the number of constants in the
equation. It is the number of degrees of freedom used in the estimate
$s'^2$. Then to test the significance of the deviation of $b_i$ from a
hypothetical value $\beta_i$ (e.g., zero) we set

$$t = \frac{b_i - \beta_i}{s'\,c_{ii}^{1/2}} = (b_i - \beta_i) \div \left[\frac{c_{ii}\,\Sigma(Y - Y')^2}{N - k - 1}\right]^{1/2} \tag{19}$$

and use probability tables of the quantity t with n = N − k − 1.

In the example of multiple regression worked out in Chapter
III we found

$$c_{00} = 27.9031, \qquad c_{01} = -3.1290, \qquad c_{02} = -6.3226$$

$$c_{11} = 0.3613, \qquad c_{12} = 0.7032$$

$$c_{22} = 1.4581$$

$$b_0 = -5.9357, \qquad b_1 = 1.2194, \qquad b_2 = 1.7484$$

We need

$$\Sigma(Y - Y')^2 = \Sigma Y^2 - b_0\Sigma Y - b_1\Sigma X_1 Y - b_2\Sigma X_2 Y$$

$$= 55 + 5.9357 \times 15 - 1.2194 \times 71 - 1.7484 \times 32 = 1.5093$$

To test whether $b_1$ has a significant value, that is, whether it is
significantly different from zero, we set

$$t = 1.2194 \div \left(\frac{0.3613 \times 1.5093}{2}\right)^{1/2} = \frac{1.2194}{0.5222} = 2.335$$

From Table V we find for n = 2 that the probability that t will
exceed this value numerically is between 0.1 and 0.2, so the value
of $b_1$ is not significant at the 5 per cent level.

46. Comparing two partial regression coefficients. To compare
two partial regression coefficients in the same regression
equation,* such as $b_1$ and $b_2$, we set t equal to their difference
divided by the square root of

$$s'^2(c_{11} - 2c_{12} + c_{22}) \tag{20}$$

* See Fisher, “Statistical Methods for Research Workers,” section 29.

In the present example we have

$$t = \frac{b_2 - b_1}{s'(c_{11} - 2c_{12} + c_{22})^{1/2}} = \frac{0.5290}{\left[\dfrac{1.5093}{2}(0.3613 - 2 \times 0.7032 + 1.4581)\right]^{1/2}} = \frac{0.5290}{(0.311670)^{1/2}} = \frac{0.5290}{0.5583} = 0.947, \qquad n = 2$$


and Table V shows that there is no significant difference, the 
probability of a greater difference being near 0.5. 
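The comparison of the two partial regression coefficients can be verified with a few lines of Python. The function and argument names are assumptions for the illustration; the numerical values are those quoted from the multiple regression example.

```python
import math

def compare_partial_coefficients(b1, b2, c11, c12, c22, resid_ss, dof):
    """t for the difference b2 - b1 of two coefficients in the same regression equation."""
    s2 = resid_ss / dof  # estimate of variance about the regression
    return (b2 - b1) / math.sqrt(s2 * (c11 - 2 * c12 + c22))

t = compare_partial_coefficients(
    b1=1.2194, b2=1.7484,
    c11=0.3613, c12=0.7032, c22=1.4581,
    resid_ss=1.5093, dof=2,
)
print(round(t, 3))
```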

47. Testing whether a sample has been drawn from uncorrelated
material. If we are sampling from uncorrelated material
the values of r obtained follow the rather simple distribution

$$Y\,dr = \frac{[(N-3)/2]!}{\pi^{1/2}\,[(N-4)/2]!}\,(1 - r^2)^{(N-4)/2}\,dr \tag{21}$$

where N is the number of pairs of values in the sample. By means
of the substitution

$$r = \frac{t/n^{1/2}}{(1 + t^2/n)^{1/2}}, \qquad dr = \frac{dt}{n^{1/2}(1 + t^2/n)^{3/2}}, \qquad n = N - 2 \tag{22}$$

the distribution (21) is transformed into Student’s distribution (3)
of section 40 with n = N − 2.

To test whether a value of r obtained from a sample of N pairs
of values is significantly different from zero, that is, whether the
material from which we are sampling is correlated, we make the
inverse of transformation (22), viz.,

$$t = \frac{r\,n^{1/2}}{(1 - r^2)^{1/2}}, \qquad n = N - 2 \tag{23}$$

and use tables of Student’s distribution.

In section 22 of Chapter IV we found r = 0.80 from 5 pairs of
values of X and Y. Using formula (23) we find

$$t = \frac{3^{1/2} \times 0.80}{(1 - 0.64)^{1/2}} = \frac{1.3856}{0.6} = 2.310, \qquad n = 3$$





The probability that t is numerically greater than 2.310, which 
is the probability that r deviates from zero by more than ±0.80, 
is about 0.1. The observed value of r therefore could not be 
regarded as significant. The reason why such a comparatively 
high value of the correlation coefficient is not significant is that it 
is obtained from such a small number of cases, that is, the number 
of degrees of freedom is small. 
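The transformation from r to t, and its inverse, can be sketched in Python as below. The function names are illustrative; n denotes the number of degrees of freedom, which is N − 2 for a simple correlation coefficient.

```python
import math

def t_from_r(r, n):
    """t = r * n^(1/2) / (1 - r^2)^(1/2), with n degrees of freedom."""
    return r * math.sqrt(n) / math.sqrt(1 - r * r)

def r_from_t(t, n):
    """The inverse substitution, recovering r from t."""
    return (t / math.sqrt(n)) / math.sqrt(1 + t * t / n)

# r = 0.80 from 5 pairs of values, so n = 3.
t = t_from_r(0.80, 5 - 2)
print(round(t, 3), round(r_from_t(t, 3), 2))
```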

48. Testing the significance of a partial correlation coefficient.
If we have found a partial correlation coefficient $r_{12.34\ldots k}$ from N
sets of values, the significance of its difference from zero can be
obtained by using the transformation employed in the preceding
section, with n = N − k. For example, we found in section 29 a
value $r_{12.3} = -0.977$ from 5 sets of values. In this case N = 5,
k = 3, n = N − k = 2, and

$$t = \frac{2^{1/2}(-0.977)}{[1 - (-0.977)^2]^{1/2}} = -6.479$$


The probability that t will be larger than this numerically is 
slightly greater than 0.02 and the corresponding value of r may be 
regarded as significant, although not highly so. 


EXERCISES 

1. A certain stimulus administered to each of 12 patients resulted in the 
following increases of blood pressure: 6, 2, 8, —1, 3, 0, 6, —2, 1, 5, 0, 4. 
Can it be concluded that the stimulus will be in general accompanied by an 
increase in blood pressure? 

2. Using the observations of exercise 1, page 24, set 95 per cent fiducial 
limits for the vertical angular diameter of Venus. 

3. Below are given the gains in weight (pounds) of hogs fed on two different 
diets. Twelve animals were fed on diet A, 15 on diet B. Is either diet 
superior? 

Gains in weight on diet A: 25, 30, 28, 34, 24, 25, 13, 32, 24, 30, 31, 35 
Gains in weight on diet B: 44, 34, 22, 8, 47, 31, 40, 30, 32, 35, 18, 21, 35, 
29, 22

4. Test for significance the regression coefficient found in exercise 2, 
page 43. Can the population regression coefficient be regarded as great 
as 0.5? 

5. Test for significance the partial regression coefficients found in exercise 8,
page 45. 

6. A correlation coefficient of 0.42 is found in a sample of 25 pairs of 
values. Can it be regarded as significantly different from zero? 





7. If the coefficient in the preceding exercise is a partial correlation coeffi¬ 
cient of order 3 obtained from 25 sets of 5 values each, can it be regarded as 
significant? 

8. (a) Is there a significant sex difference in mean number of blood cells 
as determined from Table O, page 63? (b) Is there a significant sex differ¬ 
ence in mean amount of hemoglobin as determined from this table? 

9. Find the coefficients of regression of hemoglobin on red blood cells for 
the men and for the women of Table O, page 63. Is there a significant 
difference between these coefficients? 



CHAPTER VII 


THE CHI-SQUARE DISTRIBUTION 


49. Chi square. Suppose that the variable X has the
distribution $(2\pi)^{-1/2}\exp(-X^2/2\sigma^2)\,dX/\sigma$, in other words that it is
normally distributed about zero with variance $\sigma^2$. If we have n
values of X, we define

$$\chi^2 = \frac{\Sigma X^2}{\sigma^2} \tag{1}$$


In words, chi square is the sum of squares of n independent normal 
deviates divided by their common variance. The number of inde¬ 
pendent deviates, n, is called the number of degrees of freedom. 

It can be shown that $\chi^2$ has the distribution

$$\frac{1}{2^{n/2}\,[(n-2)/2]!}\,(\chi^2)^{(n-2)/2}\,e^{-\chi^2/2}\,d(\chi^2) \tag{2}$$

and tables of integrals of this function, which is a Pearson type III
function, have been prepared. They show the probabilities that
$\chi^2$ will exceed certain values,* or the values of $\chi^2$ that will be
exceeded certain fixed proportions of times.† Table VI at the
end of the book is of the latter type.

For large values of n, say n > 30, $\chi$ (not $\chi^2$) may be regarded
as being normally distributed, with mean $(n - \tfrac{1}{2})^{1/2}$ and standard
deviation $2^{-1/2}$; that is, the quantity $(2\chi^2)^{1/2} - (2n - 1)^{1/2}$ may be
used as a normal deviate with unit standard deviation.

A very important property of the chi-square distribution is
that the sum of any number of independent quantities each of which is
distributed in a chi-square distribution is itself distributed in a
chi-square distribution with degrees of freedom equal to the sum of the
degrees of freedom of the separate components.

* Table XII in Karl Pearson’s “Tables for Statisticians and
Biometricians,” part 1.

† Table III in R. A. Fisher’s “Statistical Methods for Research Workers.”

50. Distribution of variances and standard deviations. If in
formula (2) of section 49 we set

$$\chi^2 = \frac{Ns^2}{\sigma^2}, \qquad d(\chi^2) = \frac{N}{\sigma^2}\,d(s^2), \qquad n = N - 1 \tag{3}$$

where $s^2$ is the variance of a sample of N individuals, the chi-square
distribution is transformed into

$$\frac{N^{(N-1)/2}}{2^{(N-1)/2}\,[(N-3)/2]!\,\sigma^{N-1}}\,(s^2)^{(N-3)/2}\,e^{-Ns^2/2\sigma^2}\,d(s^2) \tag{4}$$


which is the distribution of variances of samples of size N from 
normal material. This can readily be changed into the distribu¬ 
tion of standard deviations if desired. 

The above connection between chi square and variance can be 
used in comparing a sample variance with the population variance. 
Suppose, for example, that a sample of 9 individuals has a variance 
of 4.8. How likely is it that the sample came from a population 
having a variance of 3.2? 

We set

$$\chi^2 = \frac{9 \times 4.8}{3.2} = 13.5, \qquad n = N - 1 = 8$$

From Table VI, for 8 degrees of freedom, we find

$$P(\chi^2 > 13.362) = 0.10$$


Thus, a population having a variance of 3.2 would yield a sample 
having a variance of 4.8 or greater about once out of 10 times. 
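The same comparison can be made without a table by integrating the chi-square distribution numerically. The Python sketch below is illustrative only: the function names are assumptions, Simpson's rule replaces Table VI, and the ordinate as written is evaluated at zero, so the sketch assumes at least 2 degrees of freedom.

```python
import math

def chi_square_density(x, n):
    """Ordinate of the chi-square distribution with n degrees of freedom (n >= 2)."""
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))

def upper_tail(x_obs, n, steps=2000):
    """P(chi-square > x_obs): one minus the Simpson integral over [0, x_obs]."""
    h = x_obs / steps
    total = chi_square_density(0.0, n) + chi_square_density(x_obs, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi_square_density(i * h, n)
    return 1 - total * h / 3

# Sample of 9 with variance 4.8, hypothetical population variance 3.2.
size, sample_var, pop_var = 9, 4.8, 3.2
chi_sq = size * sample_var / pop_var  # N s^2 / sigma^2
p = upper_tail(chi_sq, size - 1)
print(round(chi_sq, 2), round(p, 3))
```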

From equation (4) can be obtained the value of $\sigma^2$ which will
make the observed value of $s^2$ the most probable. This value is
called the optimum value of $\sigma^2$, and the method of obtaining it
is called the method of maximum likelihood. The method consists
of differentiating (4) with respect to $\sigma^2$, setting the derivative
equal to zero, and solving the resulting equation. This gives

$$\sigma^2 = \frac{N}{N-1}\,s^2 \tag{5}$$

For this reason the quantity

$$s'^2 = \frac{N}{N-1}\,s^2 = \frac{1}{N-1}\,\Sigma(X_i - \bar{X})^2 \tag{6}$$


is regarded as a better estimate of the population variance than is 
the sample variance itself. Also (6) is the mean value of the 
variance of samples of size N. 

51. Testing the homogeneity of several estimated variances.*
The chi-square distribution can be used as an approximate test of
the homogeneity of several estimates of variance. Suppose that
we have k independent estimates of variance,

$$s_1'^2 = \frac{1}{n_1}\Sigma(X_1 - \bar{X}_1)^2, \quad s_2'^2 = \frac{1}{n_2}\Sigma(X_2 - \bar{X}_2)^2, \quad \ldots, \quad s_k'^2 = \frac{1}{n_k}\Sigma(X_k - \bar{X}_k)^2 \tag{7}$$

based upon $n_1, n_2, \ldots, n_k$ degrees of freedom respectively. The
pooled variance is the weighted mean

$$s'^2 = \frac{\Sigma n_i s_i'^2}{n}, \qquad n = \Sigma n_i \tag{8}$$

Then the quantity

$$\chi^2 = \frac{1}{C}\left(n\log_e s'^2 - \Sigma n_i\log_e s_i'^2\right) = \frac{2.3026}{C}\left(n\log_{10} s'^2 - \Sigma n_i\log_{10} s_i'^2\right) \tag{9}$$

in which

$$C = 1 + \frac{1}{3(k-1)}\left(\Sigma\frac{1}{n_i} - \frac{1}{n}\right) \tag{10}$$

is approximately distributed as $\chi^2$ with k − 1 degrees of freedom,
and an unduly large value of $\chi^2$ will indicate the presence of
discrepancies among the several estimates of variance. (The special
case in which only two estimates are involved is treated in the next
chapter.) Since C > 1, it need not be calculated if $\chi^2$ for C = 1
is nonsignificant.

For example, if we have three estimates of variance, 4.2, 6.0,
and 3.1, based on 4, 5, and 11 degrees of freedom respectively, we
can form a table such as Table 17, from which we find

TABLE 17

   i      s'_i²     n_i     n_i s'_i²     log_e s'_i²     n_i log_e s'_i²
   1       4.2       4        16.8         1.43508           5.74032
   2       6.0       5        30.0         1.79176           8.95880
   3       3.1      11        34.1         1.13140          12.44540
 Total              20        80.9         4.35824          27.14452

$$s'^2 = \frac{\Sigma n_i s_i'^2}{\Sigma n_i} = \frac{80.9}{20} = 4.045$$

$$n\log_e s'^2 = 20 \times 1.39748 = 27.94960$$

$$C = 1 + \frac{1}{3(3-1)}\left(\frac{1}{4} + \frac{1}{5} + \frac{1}{11} - \frac{1}{20}\right) = 1.0818$$

$$\chi^2 = \frac{1}{C}\left(n\log_e s'^2 - \Sigma n_i\log_e s_i'^2\right) = \frac{27.94960 - 27.14452}{1.0818} = \frac{0.80508}{1.0818} = 0.744$$

* See M. S. Bartlett, “Properties of sufficiency and statistical tests,”
Proceedings of the Royal Society of London, series A, vol. 160, 1937, pp. 273 ff.


The probability of a greater value than this, for 2 degrees of 
freedom, is between 0.50 and 0.70, so that the three estimates of 
variance may be regarded as homogeneous. Actually the use of C 
is unnecessary in this example. 
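The computation of Table 17 can be sketched in Python as follows. The function name is an assumption; natural logarithms are used throughout, and the correction factor C is computed in every case even though, as noted, it may often be dispensed with.

```python
import math

def homogeneity_chi_square(variances, dofs):
    """Return chi-square, the correction factor C, and the degrees of freedom k - 1."""
    n = sum(dofs)
    k = len(variances)
    pooled = sum(d * v for v, d in zip(variances, dofs)) / n  # weighted mean variance
    m = n * math.log(pooled) - sum(d * math.log(v) for v, d in zip(variances, dofs))
    c = 1 + (sum(1 / d for d in dofs) - 1 / n) / (3 * (k - 1))
    return m / c, c, k - 1

# Three estimates of variance with 4, 5, and 11 degrees of freedom.
chi_sq, c, dof = homogeneity_chi_square([4.2, 6.0, 3.1], [4, 5, 11])
print(round(chi_sq, 3), round(c, 4), dof)
```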

52. Small samples from binomial and Poisson distributions. 

Quite similar to the use of the chi-square test described in section 
50 is its use in testing the homogeneity of small samples of material 
supposed to have come from a binomial or from a Poisson exponen¬ 
tial distribution, by comparing the variance of the sample with the 
expected population variance as estimated from the sample. 

A single member of the binomial distribution $(q + p)^N$ will
be understood to mean the number of occurrences in N independent
trials or observations, in each of which the probability of
occurrence is p. A sample of k members is of the form $X_1, X_2, \ldots, X_k$,
in which each X is an integer between 0 and N inclusive. Then
the index of dispersion

$$\chi^2 = \frac{\Sigma(X - \bar{X})^2}{\bar{X}\left(1 - \dfrac{\bar{X}}{N}\right)} \tag{11}$$

is distributed approximately as $\chi^2$ with k − 1 degrees of freedom.
It may be noted that the denominator of (11) is the expected
value of the population variance as estimated from the sample,
and that the quantity (11) is comparable to the $\chi^2$ in (3) of
section 50.

Suppose, by way of illustration, that a plant pathologist is
investigating the distribution of a certain plant disease and has 
divided a field into plots, in each of which is a certain number of 
plants. If every plant in the field has an equal and independent 
chance of becoming infected, we expect that the index of dispersion 
will not have an unusual value. If, on the other hand, there is a 
deviation from a random distribution of infection, as for instance 
if the plots on one side of the field show a higher degree of infection, 
we may expect to get an unusual value for the index. 

In an experiment of the type just described, a field was divided 
into 12 plots, each of which contained 90 plants. The numbers 
of infected plants from the various plots were as follows: 19, 6, 9, 
18, 15, 13, 14, 15, 16, 20, 22, 14. Is the infection random? That 
is, does the variability conform to expectation? 

Here we have

$$ \Sigma(X - \bar{X})^2 = \Sigma X^2 - \frac{(\Sigma X)^2}{k} = 2953 - \frac{(181)^2}{12} = 222.92 $$

The index of dispersion, as calculated from (11), in which
$N = 90$, is

$$ \chi^2 = 17.76 $$

The number of degrees of freedom is 11, one less than the number
of plots. Table VI shows that the probability of obtaining this
value of $\chi^2$, or a larger value, is between 0.05 and 0.10, which would
indicate that groups of infected plants do not occur together oftener
than would happen by chance.
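
The index of dispersion for the plant-disease data can be recomputed directly from the counts. The sketch below is illustrative only; the function name `binomial_dispersion` is an assumption, and the denominator is the quantity $\bar{X}(1 - \bar{X}/N)$ of formula (11).

```python
def binomial_dispersion(counts, n_trials):
    """Index of dispersion for k binomial counts, each out of n_trials."""
    k = len(counts)
    mean = sum(counts) / k
    # Sum of squares of deviations from the sample mean.
    ss = sum((x - mean) ** 2 for x in counts)
    # Expected binomial variance as estimated from the sample.
    expected_var = mean * (1 - mean / n_trials)
    return ss / expected_var, k - 1

infected = [19, 6, 9, 18, 15, 13, 14, 15, 16, 20, 22, 14]
chi2, df = binomial_dispersion(infected, 90)
print(round(chi2, 2), df)
```

Carried to full precision, the result agrees with the figure in the text to within the rounding of the intermediate quantities.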

The index of dispersion for the Poisson exponential distribution is

$$ \chi^2 = \frac{\Sigma(X - \bar{X})^2}{\bar{X}} \tag{12} $$

The denominator is the expected value of the population variance
as estimated from the sample, since in the Poisson distribution the
variance is equal to the mean.

This index is useful in checking whether the variability in
bacterial counts made by the dilution method is that which is to
be expected. According to theory, if the technique of dilution provides
a random distribution of organisms, and if these can develop
on the plate independently, then the numbers of colonies counted
on plates from the same dilution are distributed in a Poisson
exponential distribution. An unusual value of $\chi^2$ obtained from
(12) may be taken as an indication that the technique is not good.

Suppose that the following counts were obtained on 15 plates 
made with the same dilution of a bacterial culture: 193, 168, 
161, 153, 183, 152, 171, 156, 159, 140, 151, 152, 133, 164, 157. Is 
the variability of the counts consistent with that to be expected 
according to the Poisson distribution? 

We calculate the index of dispersion

$$ \chi^2 = \frac{\Sigma(X - \bar{X})^2}{\bar{X}} = \frac{3229.73}{159.53} = 20.25 $$

For 14 degrees of freedom (that is, one less than the number of
counts), this is not an unusual value, since Table VI shows that
the probability of a greater value is between 0.1 and 0.2. We
conclude that the variability is consistent with expectation.
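
The same calculation can be sketched in Python, together with the probability that Table VI brackets. The helper `chi2_sf` is an assumption of this sketch and is not part of the text: it builds the upper-tail probability for a whole number of degrees of freedom from the exact results for 1 and 2 degrees of freedom, stepping up two degrees at a time.

```python
import math

def chi2_sf(x, df):
    """P(chi-square > x) for a positive integer number of degrees of freedom."""
    half = x / 2
    if df % 2 == 0:
        q, k = math.exp(-half), 2              # exact for 2 degrees of freedom
    else:
        q, k = math.erfc(math.sqrt(half)), 1   # exact for 1 degree of freedom
    while k < df:
        # Adding this term converts the tail for k degrees into that for k + 2.
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q

def poisson_dispersion(counts):
    """Index of dispersion (12) and its degrees of freedom."""
    mean = sum(counts) / len(counts)
    return sum((x - mean) ** 2 for x in counts) / mean, len(counts) - 1

counts = [193, 168, 161, 153, 183, 152, 171, 156, 159, 140, 151, 152, 133, 164, 157]
chi2, df = poisson_dispersion(counts)
p = chi2_sf(chi2, df)
print(round(chi2, 2), df, round(p, 2))
```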

53. Combining homogeneous estimates of correlation. If
correlation coefficients have been calculated from two or more
samples extracted from equally correlated populations, then their
values can be combined to give a better estimate of the population
correlation.

Suppose, for example, that we have found the correlations
$r_1, r_2, \ldots, r_k$ in $k$ independent samples of size $N_1, N_2, \ldots, N_k$,
respectively. We make the transformations *

$$ z_i = \tfrac{1}{2} \log_e \frac{1 + r_i}{1 - r_i} \tag{13} $$

The several $z_i$ must be weighted inversely according to their variances.
Since the variance of $z_i$ is * $1/(N_i - 3)$, the weighted
average of the $z$'s is

$$ \bar{z} = \frac{\Sigma(N_i - 3) z_i}{\Sigma(N_i - 3)} \tag{14} $$

This may be changed back to a correlation coefficient by the
transformation

$$ \bar{z} = \tfrac{1}{2} \log_e \frac{1 + r}{1 - r}, \qquad r = \frac{e^{2\bar{z}} - 1}{e^{2\bar{z}} + 1} \tag{15} $$

Before combining different values of $r$ in this way, however,
one should test the validity of the underlying assumption that
the samples from which they were calculated came from equally
correlated populations.

The correlation coefficient $r_i$ is an estimate of a population
correlation $\rho_i$ appropriate to the $i$th sample, and if we make the
transformations (13) then each $z_i$ is approximately normally
distributed with mean and variance

$$ \mu_i = \tfrac{1}{2} \log_e \frac{1 + \rho_i}{1 - \rho_i}, \qquad \sigma_{z_i}^2 = \frac{1}{N_i - 3} \tag{16} $$

respectively.

The hypothesis to be tested is that

$$ \rho_i = \rho, \qquad i = 1, 2, \ldots, k \tag{17} $$

in which $\rho$ is a constant, or that

$$ \mu_i = \mu = \tfrac{1}{2} \log_e \frac{1 + \rho}{1 - \rho} \tag{18} $$

(It should be remarked here that the case in which $k = 2$ has
already been treated in section 39.)

* See section 38.

On this hypothesis we have $k$ independent quantities,
$z_i$ ($i = 1, 2, \ldots, k$), distributed in an approximately normal manner
with common mean $\mu$ and with variances given by the last
part of (16). In a somewhat more general case the variance of
$z_i$ might be given by $\sigma^2/(N_i - 3)$, so that we are led to the
general problem of finding an estimate of the variance $\sigma^2$ from a
set of numbers which are normally distributed about a common
mean, and whose variances are known fractions of that variance.

It is found * that the estimate $s'^2$ of $\sigma^2$ may be obtained
from the equation

$$ (k - 1)s'^2 = \Sigma(N_i - 3)(z_i - \bar{z})^2 = \Sigma(N_i - 3)z_i^2 - \frac{[\Sigma(N_i - 3)z_i]^2}{\Sigma(N_i - 3)} \tag{19} $$

in which $\bar{z}$ is the weighted mean of the $z_i$ as given by (14).
Then the quantity

$$ \chi^2 = \frac{(k - 1)s'^2}{\sigma^2} \tag{20} $$

is distributed as $\chi^2$ with $k - 1$ degrees of freedom. Since $\sigma^2 = 1$
in the present case, our test reduces to testing the significance of
the quantity (19), calling it $\chi^2$ because of (20). If this $\chi^2$ is significant,
we judge that the variation among the $z$'s is greater than
is to be expected at the level of significance chosen, on the hypothesis
that $\mu_i = \mu$ ($i = 1, 2, \ldots, k$). In such a situation we reject
the hypothesis, which implies the rejection of the hypothesis
$\rho_i = \rho$.

If $\chi^2$ is not significant, we judge that the $r_i$ are homogeneous
and proceed to combine them as explained in the early part of this
section. It will be noted that the quantities needed in forming
the combined estimate have already been calculated in making the
homogeneity test.

* See F. Yates, "The analysis of multiple classifications with unequal
numbers in the different classes," Journal of the American Statistical
Association, vol. 29, 1934, p. 66.

As an illustration, suppose that we have obtained a correlation
coefficient of 0.60 from 33 pairs of values, another of 0.52 from 40
pairs of values, and a third of 0.44 from 28 pairs of values. The
calculations necessary for making the homogeneity test are shown
in Table 18.


TABLE 18

  i        r_i      z_i        N_i - 3    (N_i - 3) z_i    (N_i - 3) z_i^2
  1        0.60     0.69315      30        20.79450         14.41371
  2        0.52     0.57634      37        21.32458         12.29021
  3        0.44     0.47223      25        11.80575          5.57503
  Total                          92        53.92483         32.27896


From (19) we find

$$ \chi^2 = 32.27896 - \frac{(53.92483)^2}{92} = 32.27896 - 31.60747 = 0.671 $$


For two degrees of freedom this is not an unusual value, so that 
we conclude that the three sample values of the correlation coeffi¬ 
cient may be regarded as coming from equally correlated popu¬ 
lations. 

To combine them we make use of (14), finding

$$ \bar{z} = \frac{53.92483}{92} = 0.58614 $$

From the second part of (15) we get $r = 0.53$.
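
The homogeneity test and the combined estimate can be sketched together, since both use the same sums. The function name `combine_correlations` is an assumption of this sketch; `math.atanh` and `math.tanh` are used because $\tfrac{1}{2}\log_e[(1+r)/(1-r)]$ is the inverse hyperbolic tangent of $r$.

```python
import math

def combine_correlations(rs, ns):
    """Return (chi2, df, weighted mean of transformed values, combined r)."""
    zs = [math.atanh(r) for r in rs]      # 0.5 * log((1 + r) / (1 - r))
    ws = [n - 3 for n in ns]              # weights, the reciprocals of the variances
    sw = sum(ws)
    swz = sum(w * z for w, z in zip(ws, zs))
    swz2 = sum(w * z * z for w, z in zip(ws, zs))
    chi2 = swz2 - swz ** 2 / sw           # weighted sum of squares about the mean
    z_bar = swz / sw
    return chi2, len(rs) - 1, z_bar, math.tanh(z_bar)

chi2, df, z_bar, r = combine_correlations([0.60, 0.52, 0.44], [33, 40, 28])
print(round(chi2, 3), df, round(z_bar, 5), round(r, 2))
```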

54. Test of goodness of fit. One of the principal uses of the
chi-square distribution is testing how well an observed frequency
distribution fits a theoretical distribution, although it affords only
an approximate test in this instance.

If the observed frequency in a class of the distribution is $f_o$
and the theoretical or expected frequency is $f$, then

$$ \chi^2 = \Sigma \frac{(f_o - f)^2}{f} \tag{21} $$

The number of degrees of freedom is the number of classes less the
number of constants in which we have forced the theoretical distribution
to agree with the observed. For example, if we have
made the totals agree and have caused the theoretical distribution
to have the same mean and the same standard deviation as the
observed, the number of degrees of freedom is three less than the
number of classes. If the resulting value of $\chi^2$ is unusual, say the
probability is 0.01 or less, we conclude that, if the theoretical distribution
which we have chosen represents the population adequately,
then the data which we have observed are unusual; that is, we
should get a worse fit only once out of a hundred times as a matter
of chance. In such a situation we are inclined to discard the
hypothesis that the theoretical distribution is adequate. If we
obtain an exceptionally small value of $\chi^2$, say one such that the
probability of getting a larger value is 0.99, the fit is too good
and we suspect that the data are not random.

As an illustration of the method we shall consider the distribution
of heights and the normal distribution with which it was fitted.
(See section 32.) The details of the work are shown in Table 19.

TABLE 19

Computation of $\chi^2$

  Theoretical frequency        Observed frequency     f_o - f    (f_o - f)^2    (f_o - f)^2 / f
  f                            f_o
  0.1 + 1.2 + 12.0 = 13.3      1 + 2 + 9 = 12          -1.3         1.69          0.1271
   55.1                         48                     -7.1        50.41          0.9149
  114.2                        131                     16.8       282.24          2.4715
  108.4                        102                     -6.4        40.96          0.3779
   45.4                         40                     -5.4        29.16          0.6423
  8.7 + 0.8 = 9.5               13                      3.5        12.25          1.2895
  Total 345.9                  346                      0.1       416.71          5.8232

$\chi^2 = 5.8232$, $n = 3$

$P(\chi^2 > 5.8232)$ is between 0.10 and 0.20 (Table VI)


It will be noticed that the first three classes have been combined 
into a single class. This is because the theoretical frequencies 
in the first two are small, it being an empirical rule that we should 
never use alone a class having in it a theoretical frequency less 
than five. There are six classes, but it will be recalled that we 
made the normal distribution agree with the observed distribution 
in total, mean, and standard deviation. Thus, the number of 
degrees of freedom is 6 - 3 = 3. It is seen from Table VI
that the probability of a greater value of $\chi^2$ than that observed,
viz., 5.8232, is between 0.10 and 0.20. The observed value can
hardly be regarded as significant, since a larger value would occur 
by chance oftener than once in ten times. 
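
The computation of Table 19 can be sketched as follows. The class frequencies are those of the table after the small classes have been combined; the closed form used for the probability holds only for 3 degrees of freedom and is brought in here as a known property of the chi-square distribution, not as something taken from the text.

```python
import math

# Theoretical and observed frequencies after combining the small end classes.
theoretical = [13.3, 55.1, 114.2, 108.4, 45.4, 9.5]
observed = [12, 48, 131, 102, 40, 13]

chi2 = sum((fo - f) ** 2 / f for fo, f in zip(observed, theoretical))
df = len(observed) - 3   # total, mean, and standard deviation were made to agree

# P(chi-square > x) for 3 degrees of freedom.
p = (math.erfc(math.sqrt(chi2 / 2))
     + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))
print(round(chi2, 4), df, round(p, 2))
```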

55. Application to contingency tables. If a set of individuals 
is classified with respect to two or more different attributes and 
the frequencies tabulated we have a contingency table. As a simple 
example of a contingency table let us consider a 2 X 2 table 
(Table 20).* 

TABLE 20

Hair-Color and Eye-Color (Observed)

  Eye-color      Hair-color          Total
                 Light      Dark
  Blue            26          9        35
  Brown            7         18        25
  Total           33         27        60


Our problem is to determine whether there is any connection
between hair-color and eye-color or whether they are independent
of each other. In the case of a 2 X 2 table such as the one at hand
this problem is perhaps best considered from the standpoint of the
difference between two proportions. (See section 37.) However,
we shall show how the $\chi^2$ distribution can be used in testing this
independence, as the method can be used for tables with a greater
number of compartments.

* Cf. p. 82.

If the two attributes were quite independent we should expect
the 35 blue-eyed persons to be distributed in the same proportion
as the entire group, that is, we should expect $\tfrac{33}{60} \times 35$ (= 19.25)
of them to be in the light-haired class and $\tfrac{27}{60} \times 35$ (= 15.75)
of them to be in the dark-haired group. Similarly we should expect
the 25 brown-eyed persons to be divided into $\tfrac{33}{60} \times 25$ (= 13.75)
light-haired and $\tfrac{27}{60} \times 25$ (= 11.25) dark-haired.* Thus we can
form a table of values expected on the assumption that the two
attributes are independent (Table 21).

TABLE 21

Hair-Color and Eye-Color (Expected)

  Eye-color      Hair-color          Total
                 Light      Dark
  Blue           19.25      15.75      35
  Brown          13.75      11.25      25
  Total          33         27         60


From the table of observed and theoretical frequencies we can
calculate

$$ \chi^2 = \Sigma \frac{(\text{observed value} - \text{expected value})^2}{\text{expected value}} $$

$$ = \frac{(26 - 19.25)^2}{19.25} + \frac{(9 - 15.75)^2}{15.75} + \frac{(7 - 13.75)^2}{13.75} + \frac{(18 - 11.25)^2}{11.25} $$

$$ = \frac{(6.75)^2}{19.25} + \frac{(-6.75)^2}{15.75} + \frac{(-6.75)^2}{13.75} + \frac{(6.75)^2}{11.25} = 12.6234 $$

The number of degrees of freedom is the number of compartments
of the table which we are free to fill. In this case this is merely one,
because if we fill one of them the others are determined by the
marginal totals. In general, for a contingency table with $r$ rows
and $c$ columns, the number of degrees of freedom is $(r - 1)(c - 1)$.
For $n = 1$ we find

$$ P(\chi^2 > 12.6234) < 0.01 $$

so that in our example the departure from independence is
decidedly significant.

* It should be noted that testing whether the proportions of light-haired
and dark-haired are the same in each group (that is, blue-eyed and brown-eyed)
is equivalent to testing whether the two groups are samples from the
same population.

If $n = 1$ the $\chi^2$ distribution (2) reduces, after division by 2,* to

$$ (2\pi)^{-1/2} e^{-\chi^2/2}\, d\chi $$

which is the normal distribution for $\chi$. The probability that $\chi^2$
is greater than $k$ is

$$ P(\chi^2 > k) = P(|\chi| > \sqrt{k}) = 2P(\chi > \sqrt{k}) = 2[1 - \varphi_{-1}(\sqrt{k})] \tag{22} $$

It must be remembered that this applies only if $n = 1$.

In our example we find $(12.6234)^{1/2} = 3.55+$, and

$$ P(\chi^2 > 12.6234) = 2[1 - \varphi_{-1}(3.55)] = 2[1 - 0.99981] = 0.00038 $$

It should be noted that the $\chi^2$ test applied to a 2 X 2 contingency
table always gives the same result as the method of testing
the difference between two proportions given in section 37.
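
The contingency-table computation, including the expected values of Table 21 and the probability for one degree of freedom, can be sketched as below. The function name `chi2_independence` is an assumption of the sketch; the identity $2[1 - \Phi(\sqrt{k})] = \operatorname{erfc}(\sqrt{k/2})$, with $\Phi$ the normal integral, is used to obtain the probability from the standard library.

```python
import math

def chi2_independence(table):
    """Chi-square and degrees of freedom for an r x c table of frequencies."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / total   # expected value under independence
            chi2 += (obs - exp) ** 2 / exp
    return chi2, (len(rows) - 1) * (len(cols) - 1)

chi2, df = chi2_independence([[26, 9], [7, 18]])
# Valid for one degree of freedom only.
p = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 4), df, round(p, 5))
```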

56. Contingency tables with small frequencies.† If the
number in one or more compartments of the table is small, say less
than 5, certain refinements of the above method yield better results.
For example, consider Table 22, which shows the results of exposure
of 20 people to a certain disease, 7 of the people having been
inoculated and 13 not. The question to be answered is whether
the inoculation is effective, that is, whether the frequencies in
the various compartments differ significantly, on the whole, from
the values that would be expected if they were distributed in proportion
to the marginal totals.

TABLE 22

Results of Exposure to a Disease

                     Attacked    Not attacked    Total
  Not inoculated        10            3            13
  Inoculated             2            5             7
  Total                 12            8            20

In such a table we use Yates's correction, which consists of adding
$\tfrac{1}{2}$ to the smallest frequency of the table and adjusting the
others so that the marginal totals will remain the same. This is
quite comparable to our procedure in approximating the point
binomial by the normal curve, when, in order to find the probability
of more than 10 occurrences, we use
$1 - \varphi_{-1}[(10.5 - \mu)/(Npq)^{1/2}]$ rather than
$1 - \varphi_{-1}[(10 - \mu)/(Npq)^{1/2}]$. The numbers 10, 3, 2, 5 in Table 22
would then be replaced by 9.5, 3.5, 2.5, 4.5, respectively, and the
expected values would be

$$ \tfrac{12}{20} \times 13 = 7.8, \quad \tfrac{8}{20} \times 13 = 5.2, \quad \tfrac{12}{20} \times 7 = 4.2, \quad \tfrac{8}{20} \times 7 = 2.8 $$

As before,

$$ \chi^2 = \frac{(9.5 - 7.8)^2}{7.8} + \frac{(3.5 - 5.2)^2}{5.2} + \frac{(2.5 - 4.2)^2}{4.2} + \frac{(4.5 - 2.8)^2}{2.8} = 2.64652 $$

$$ \chi = 1.627 $$

$$ P(\chi^2 > 2.64652) = 2P(\chi > 1.627) = 2[1 - \varphi_{-1}(1.627)] = 2(1 - 0.94813) = 0.10374 $$

not a significant value.

* It is necessary to divide by 2, otherwise the area under the normal curve
from $\chi = 0$ to $\chi = \infty$ (corresponding to $\chi^2 = 0$ and $\chi^2 = \infty$ respectively)
would be 1 instead of $\tfrac{1}{2}$.

† For a good discussion of this topic see F. Yates, "Contingency tables
involving small numbers and the $\chi^2$ test," Supplement to the Journal of the
Royal Statistical Society, vol. 1 (1934), pp. 217-236; and J. O. Irwin, "Tests
of significance for differences between percentages based on small numbers,"
Metron, vol. 12, No. 2 (1935), pp. 1-94; also R. A. Fisher, "Statistical Methods
for Research Workers," section 21.02.

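Without Yates's correction we should have found

$$ \chi^2 = \frac{(2.2)^2}{7.8} + \frac{(-2.2)^2}{5.2} + \frac{(-2.2)^2}{4.2} + \frac{(2.2)^2}{2.8} = 4.4322 $$

$$ \chi = 2.105+ $$

$$ P(\chi^2 > 4.4322) = 2P(\chi > 2.105) = 2[1 - \varphi_{-1}(2.105)] = 2(1 - 0.98236) = 0.03528 $$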
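
The corrected and uncorrected values can be compared side by side. In the sketch below the correction is applied by reducing each absolute deviation from expectation by one-half, which for a 2 X 2 table is an assumption equivalent to the adjustment of the frequencies described above; the function names are illustrative only.

```python
import math

def chi2_2x2(table, yates=False):
    """Chi-square for a 2 x 2 table, optionally with Yates's correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / n
            dev = abs(obs - exp)
            if yates:
                dev = max(dev - 0.5, 0.0)   # move each frequency 1/2 toward expectation
            chi2 += dev ** 2 / exp
    return chi2

def two_tail_p(chi2):
    """P(chi-square > chi2) for one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

table = [[10, 3], [2, 5]]
corrected = chi2_2x2(table, yates=True)
uncorrected = chi2_2x2(table)
print(round(corrected, 4), round(two_tail_p(corrected), 4))
print(round(uncorrected, 4), round(two_tail_p(uncorrected), 4))
```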













It seems appropriate to consider at this point the exact treatment
of a 2 X 2 table, which in general may be exhibited as
Table 23.

TABLE 23

               A          Not A       Total
  B            a          b           a + b
  Not B        c          d           c + d
  Total        a + c      b + d       N

It can be shown that when the marginal totals are fixed the
probability of the observed values $a$, $b$, $c$, $d$ in a contingency table is

$$ \frac{(a + b)!\,(c + d)!\,(a + c)!\,(b + d)!}{(a + b + c + d)!\,a!\,b!\,c!\,d!} \tag{23} $$

If then we wish to find, in the above example (Table 22), the probability
of the observed set of frequencies, or a set more extreme,
viz.,

    10   3        11   2        12   1
     2   5         1   6         0   7

we calculate the sum

$$ \frac{13!\,7!\,12!\,8!}{20!\,10!\,3!\,2!\,5!} + \frac{13!\,7!\,12!\,8!}{20!\,11!\,2!\,1!\,6!} + \frac{13!\,7!\,12!\,8!}{20!\,12!\,1!\,0!\,7!} $$

$$ = \frac{13!\,7!\,12!\,8!}{20!} \left( \frac{1}{10!\,3!\,2!\,5!} + \frac{1}{11!\,2!\,1!\,6!} + \frac{1}{12!\,1!\,0!\,7!} \right) $$

$$ = \frac{1}{9690}(462 + 42 + 1) = 0.0521156 $$

This exact probability is comparable with $P(\chi > 1.627) = \tfrac{1}{2} \times 0.10374 = 0.05187$,
using Yates's correction, and with
$P(\chi > 2.105) = \tfrac{1}{2} \times 0.03528 = 0.01764$, obtained without it. It is thus seen that,
in this particular case at least, Yates's correction yields much
better results; in fact, without it our calculated probability is far
from correct.

In conclusion we may say that a 2 X 2 contingency table can
be dealt with by the method of the difference of two proportions or
by the $\chi^2$ method, but if the latter is used and any frequency is
small, say less than 5, Yates's correction should be made. Better
still is to employ the exact method.
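
The exact method lends itself to a short sketch with exact rational arithmetic. The function names are assumptions, and so is the direction of "more extreme": the loop assumes that the more extreme tables are reached by increasing $a$ (and $d$) while the marginal totals stay fixed, as in the example above.

```python
from fractions import Fraction
from math import comb

def table_probability(a, b, c, d):
    """Probability (23) of one 2 x 2 table with fixed margins."""
    # Formula (23) rewritten with binomial coefficients; the two forms are equal.
    return Fraction(comb(a + b, a) * comb(c + d, c), comb(a + b + c + d, a + c))

def one_tail_probability(a, b, c, d):
    """Sum over the observed table and those more extreme in the same direction."""
    total = Fraction(0)
    while b >= 0 and c >= 0:
        total += table_probability(a, b, c, d)
        a, b, c, d = a + 1, b - 1, c - 1, d + 1   # next table with the same margins
    return total

p = one_tail_probability(10, 3, 2, 5)
print(p, float(p))
```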

For a fuller discussion of testing independence in contingency 
tables the reader should consult the references given earlier in the 
section. 


EXERCISES 

1. An entomologist sprayed 10 batches of 100 insects each with an insecticide.
The numbers killed in the various batches were as follows: 30, 54, 33,
60, 63, 48, 54, 63, 54, 51. Does the variance conform to what might be
expected on the basis of a binomial distribution of mortality?

2. The following counts of bacteria from the same culture were made on
8 plates: 260, 196, 204, 246, 186, 260, 198, 278. Is the variability in conformity
with what might be expected in samples from a Poisson distribution?

3. Test the goodness of fit of the normal curve fitted in exercise 8, page 86.

4. Test the goodness of fit of the normal curves fitted in exercise 9, page 86.

5. Test the goodness of fit of the normal curves fitted in exercise 10, page 86.

6. Use the $\chi^2$ distribution to test the significance of the difference in death
rates in (a) exercise 16, page 86; (b) exercise 17, page 87; (c) exercise 18,
page 87.

7. If, in a cross between hybrids, two Mendelian factors are inherited
independently, that is, if there is no linkage between the two, then the four
possible combinations should theoretically occur in the proportion 9:3:3:1.
In an experiment with a species of flower the results shown in Table T were
noted. (Gregory, Journal of Genetics, vol. 1.) Are these results consistent
with this proportion?


TABLE T

Results of Crossing Two Hybrids of a Species of Flower
(Frequencies of Various Combinations)

  Magenta flower    Magenta flower    Red flower      Red flower
  Green stigma      Red stigma        Green stigma    Red stigma
       120               48               36              13

8. Can it be concluded from the data of Table U that bottle-feeding is
conducive to malocclusion of teeth? (Yates, Supplement to the Journal of the
Royal Statistical Society, vol. 1.)

TABLE U

Malocclusion of the Teeth in Infants

                 Normal Teeth    Malocclusion
  Breast-fed          4               16
  Bottle-fed          1               21














CHAPTER VIII 


ANALYSIS OF VARIANCE 


57. Comparing two variances. Let two independent estimates
of the variance of a normally distributed variable $X$ be

$$ s_1'^2 = \frac{1}{n_1} \Sigma(X_1 - \bar{X}_1)^2, \qquad s_2'^2 = \frac{1}{n_2} \Sigma(X_2 - \bar{X}_2)^2 $$

which are based upon $n_1$ and $n_2$ degrees of freedom respectively.
Then the probability distribution of

$$ w = \frac{s_1'^2}{s_2'^2} \tag{1} $$

is known to be *

$$ \frac{[(n_1 + n_2 - 2)/2]!}{[(n_1 - 2)/2]!\,[(n_2 - 2)/2]!} \cdot \frac{n_1^{n_1/2}\, n_2^{n_2/2}\, w^{(n_1 - 2)/2}\, dw}{(n_1 w + n_2)^{(n_1 + n_2)/2}} \tag{2} $$

and if we make the transformation $w = e^{2z}$ we find that the
distribution of

$$ z = \tfrac{1}{2} \log_e w = \tfrac{1}{2} \log_e \frac{s_1'^2}{s_2'^2} = \log_e \frac{s_1'}{s_2'} \tag{3} $$

is Fisher's distribution †

$$ \frac{2[(n_1 + n_2 - 2)/2]!\; n_1^{n_1/2}\, n_2^{n_2/2}\, e^{n_1 z}\, dz}{[(n_1 - 2)/2]!\,[(n_2 - 2)/2]!\,(n_1 e^{2z} + n_2)^{(n_1 + n_2)/2}} \tag{4} $$

The subscript 1 should always be used with the greater estimated
variance.

* See, for example, J. O. Irwin, "Mathematical theorems involved in the
analysis of variance," Journal of the Royal Statistical Society, vol. 94, 1931,
pp. 287 f.

† See Irwin, loc. cit. Also see R. A. Fisher, "On a distribution yielding the
error functions of several well-known statistics," Proceedings of the International
Mathematical Congress, Toronto, 1924, pp. 805-813. In this paper
Fisher shows how the normal distribution, the chi-square distribution, and
Student's distribution may be regarded as special cases of his general z distribution,
and summarizes the chief uses of all these distributions.


The probability that the ratio (1) will be greater than a specified
value $w$ is found by integrating (2) from $w$ to $\infty$. It is more
useful, however, to know the value of $w$ for a fixed probability
such as 0.01, and values of the ratio $w$ corresponding to probabilities
of 0.05 and 0.01 and to various values of $n_1$ and $n_2$ have been
tabulated by Snedecor.* These values are called respectively the
5 per cent and 1 per cent values of $w$. Similarly, values of $z$
corresponding to probabilities of 0.05, 0.01, and 0.001 have been
worked out by Fisher and Deming and have been published in
Fisher's book.† These 5 per cent and 1 per cent values of $z$ are
reproduced, by permission, as Tables VII and VIII respectively
at the end of this book. In testing variances Sheppard's correction
should not be applied.

For large values of $n_1$ and $n_2$, and for moderate values if they
are equal or nearly so, $z$ is approximately normally distributed
with standard deviation

$$ \sigma_z = \sqrt{\tfrac{1}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \tag{5} $$


As an illustration, consider two samples composed of 7 and 9
individuals respectively, and having variances 9.6 and 4.8 respectively.
Is the variance 9.6 significantly greater than the variance 4.8?

We have

$$ n_1 = N_1 - 1 = 6, \quad s_1^2 = 9.6, \quad s_1'^2 = \frac{N_1 s_1^2}{N_1 - 1} = \tfrac{7}{6} \times 9.6 = 11.2 $$

$$ n_2 = N_2 - 1 = 8, \quad s_2^2 = 4.8, \quad s_2'^2 = \frac{N_2 s_2^2}{N_2 - 1} = \tfrac{9}{8} \times 4.8 = 5.4 $$

$$ w = \frac{11.2}{5.4} = 2.074 $$

From Snedecor's tables we find that the 5 per cent point corresponding
to $n_1 = 6$ and $n_2 = 8$ is 3.58 and that the 1 per cent
point is 6.37. The obtained value is well within the 5 per cent
point, and the first variance can not be regarded as significantly
greater than the second, although it is twice as large.

We could also make the test by setting

$$ z = \tfrac{1}{2} \log_e 2.074 = 1.1513 \log_{10} 2.074 = 0.3648 $$

From Tables VII and VIII the 5 per cent and 1 per cent values for
$z$ corresponding to $n_1 = 6$, $n_2 = 8$ are 0.6378 and 0.9259 respectively,
and the obtained value is well inside the 5 per cent point.

* George W. Snedecor, "Statistical Methods Applied to Experiments in
Agriculture and Biology," Collegiate Press, Inc., Ames, Iowa, 1937. These
tables are also to be found in C. B. Davenport and Merle P. Ekas, "Statistical
Methods in Biology, Medicine and Psychology" (4th edition), John Wiley &
Sons, Inc., New York, 1936. In Snedecor's notation the ratio is denoted
by F.

† R. A. Fisher, "Statistical Methods for Research Workers." These tables
are also to be found in Davenport and Ekas, op. cit.
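
The two forms of the test can be sketched in a few lines. The function name `variance_ratio` is an assumption; the tabled 5 per cent points against which the results are compared (3.58 for $w$ and 0.6378 for $z$) are simply those quoted in the text for 6 and 8 degrees of freedom.

```python
import math

def variance_ratio(n1_obs, var1, n2_obs, var2):
    """Return (w, z, degrees of freedom) from two sample variances with divisor N."""
    # Convert each sample variance to an estimate based on N - 1 degrees of freedom.
    est1 = n1_obs * var1 / (n1_obs - 1)
    est2 = n2_obs * var2 / (n2_obs - 1)
    w = est1 / est2                 # the greater estimate goes in the numerator
    z = 0.5 * math.log(w)           # Fisher's z, half the natural logarithm of w
    return w, z, (n1_obs - 1, n2_obs - 1)

w, z, dfs = variance_ratio(7, 9.6, 9, 4.8)
print(round(w, 3), round(z, 4), dfs)
print(w < 3.58, z < 0.6378)         # both inside the 5 per cent points of the text
```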

Although these two methods are perhaps in most common use,
there are other ways of comparing two variances. For example,
the transformation

$$ w = \frac{n_2 u}{n_1(1 - u)} \quad \text{or} \quad u = \frac{n_1 w}{n_1 w + n_2} \tag{6} $$

will carry the distribution (2) into

$$ \frac{[(n_1 + n_2 - 2)/2]!}{[(n_1 - 2)/2]!\,[(n_2 - 2)/2]!}\; u^{(n_1 - 2)/2} (1 - u)^{(n_2 - 2)/2}\, du \tag{7} $$

whose integral is the incomplete beta function. From tables of this
function * the significance of the ratio between two observed variances
can be tested.

58. Analysis of variance as applied to linear regression. It is
often possible to separate the variance into the constituent parts
contributed by various factors, and it is in this possibility that
the power and usefulness of Fisher's method known as the analysis
of variance lie. Let us, for example, consider the regression line
fitted in section 18 to the set of points $(X, Y) = (0, 1), (1, 3), (3, 2),
(6, 5), (8, 4)$. The equation of this line was found to be $Y' = 1.646
+ 0.376X$, and the sum of squares of deviations from it was
$\Sigma(Y - Y')^2 = 3.60620$.

* "Tables of the Incomplete Beta-Function," Biometrika Office, London.

Now the sum of squares of deviations from the mean can be
written

$$ \Sigma(Y - \bar{Y})^2 = \Sigma[(Y - Y') + (Y' - \bar{Y})]^2 $$

$$ = \Sigma(Y - Y')^2 + 2\Sigma(Y - Y')(Y' - \bar{Y}) + \Sigma(Y' - \bar{Y})^2 \tag{8} $$

The middle term is zero, since

$$ \Sigma(Y - Y')(Y' - \bar{Y}) = \Sigma(Y - a - bX)(a + bX - \bar{Y}) $$

$$ = a\Sigma(Y - a - bX) + b\Sigma X(Y - a - bX) - \bar{Y}\Sigma(Y - a - bX) $$

and all three of these terms vanish by reason of the normal equations
from which the regression line was determined.
Consequently (8) reduces to

$$ \Sigma(Y - \bar{Y})^2 = \Sigma(Y - Y')^2 + \Sigma(Y' - \bar{Y})^2 \tag{9} $$

which states that the total variation about the general mean, as
measured by the sum of squares of deviations from the mean, is
equal to the variation of the residuals about the regression line plus
the variation of the regression line about the mean.

The last term in (9) is usually found as the remainder after
the variation about the regression line has been subtracted from the
total variation. In the present example $\Sigma(Y - \bar{Y})^2 = \Sigma Y^2 -
(\Sigma Y)^2/N = 55 - (15)^2/5 = 10$, and $\Sigma(Y' - \bar{Y})^2 = 10 - 3.60620
= 6.39380$. However, the term $\Sigma(Y' - \bar{Y})^2$ can be calculated
independently as follows:

$$ \Sigma(Y' - \bar{Y})^2 = \Sigma(a + bX - \bar{Y})^2 = \Sigma[\bar{Y} + b(X - \bar{X}) - \bar{Y}]^2 = b^2\Sigma(X - \bar{X})^2 $$

$$ = \frac{(\Sigma XY - \Sigma X\,\Sigma Y/N)^2}{\Sigma X^2 - (\Sigma X)^2/N} = \frac{(71 - 18 \times 3)^2}{110 - 324/5} = \frac{289}{45.2} = 6.39380 $$

There are five points to which the regression line is fitted, so
that the number of degrees of freedom for estimating the total
variance is four, one having been deducted because we have calculated
the mean from the set of values. Another must be deducted
for the regression, since another constant, $b$, has been introduced.
The results may be placed in an analysis of variance table such as
Table 24.

TABLE 24

Analysis of Variance for Linear Regression

                  Sum of squares    Degrees of    Mean square
                  of deviations     freedom       deviation
  Residuals          3.60620           3            1.20207
  Regression         6.39380           1            6.39380
  Total             10                 4

To test the significance of the linear regression we can use the
ratio of the two mean square deviations, or half the difference of
their natural logarithms,

$$ z = \tfrac{1}{2}(\log_e 6.39380 - \log_e 1.20207) = 0.83564 $$

Here $n_1 = 1$, $n_2 = 3$, and for these degrees of freedom the 5 per
cent and 1 per cent points of $z$ are 1.1577 and 1.7649 respectively.
The observed value is within the 5 per cent point, and the regression
is not significant.
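
The entries of Table 24 can be reproduced from the five points. The sketch below is illustrative only, with variable names chosen for this example; it also confirms numerically that the two parts add up to the total, as equation (9) requires.

```python
import math

xs = [0, 1, 3, 6, 8]
ys = [1, 3, 2, 5, 4]
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n

# Slope and intercept of the least-squares line Y' = a + bX.
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b = sxy / sxx
a = y_bar - b * x_bar

ss_total = sum((y - y_bar) ** 2 for y in ys)
ss_regression = b ** 2 * sxx
ss_residual = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))

# Ratio of mean squares (1 and n - 2 degrees of freedom) and Fisher's z.
w = (ss_regression / 1) / (ss_residual / (n - 2))
z = 0.5 * math.log(w)
print(round(a, 3), round(b, 3))
print(round(ss_residual, 4), round(ss_regression, 4), round(ss_total, 4), round(z, 4))
```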

We have just shown that

$$ \Sigma(Y' - \bar{Y})^2 = b^2\Sigma x^2 $$

We know, moreover, that

$$ \Sigma(Y - Y')^2 = \Sigma(y - y')^2 = \Sigma y^2 - b^2\Sigma x^2 = \Sigma y^2 - \Sigma y'^2 \tag{10} $$

The sum of these two expressions is the total sum of squares of
deviations $\Sigma y^2$. From the above, and from formula (9), page 49,
we find that

$$ 1 - r^2 = \frac{\Sigma y^2 - \Sigma y'^2}{\Sigma y^2} \tag{11} $$

Also, since $r^2 = b^2\Sigma x^2 / \Sigma y^2$, we see that we may in general
analyze the sum of squares of deviations as follows:

                                                                  Degrees of
                                                                  freedom
  Residuals  = $\Sigma y^2 - b^2\Sigma x^2 = (1 - r^2)\Sigma y^2$      N - 2
  Regression = $b^2\Sigma x^2 = r^2\Sigma y^2$                         1
  Total      = $\Sigma y^2$                                            N - 1

Actually, then, testing whether the mean square deviation due
to regression is significantly greater than the mean square of
residuals incidentally tests whether the correlation coefficient is
significant.

The question may be considered from a somewhat more general
standpoint. Thus, if we wish to test the significance of the
departure of an observed set of data from a hypothetical population
regression $\alpha + \beta x$ (for simplicity we have taken the
origin of $x$ at the mean), we can analyze the sum of squares of
deviations from the population regression line into the sum of
squares of residual deviations about the sample regression line
$Y' = a + bx$ and two other terms as follows:

$$ \Sigma(Y - \alpha - \beta x)^2 = \Sigma[(Y - Y') + (Y' - \alpha - \beta x)]^2 $$

$$ = \Sigma[(Y - a - bx) + (a - \alpha) + (b - \beta)x]^2 $$

$$ = \Sigma(Y - a - bx)^2 + N(a - \alpha)^2 + (b - \beta)^2\Sigma x^2 \tag{12} $$

it being demonstrable that the cross-product terms vanish on
summation. The various terms of this equation with their appropriate
degrees of freedom are shown in Table 25.

TABLE 25

Analysis of Variance for Linear Regression, Showing Subdivision of
Regression Sum of Squares

                        Sum of squares                      Degrees of
                        of deviations                       freedom
  Residuals             $\Sigma(Y - a - bx)^2$                 N - 2
  Constant term         $N(a - \alpha)^2$                      1
  First degree term     $(b - \beta)^2\Sigma x^2$              1
  Total                 $\Sigma(Y - \alpha - \beta x)^2$       N











To test the significance of the first degree term we consider
the ratio

$$ \frac{(b - \beta)^2\Sigma x^2}{\Sigma(Y - a - bx)^2/(N - 2)} \tag{13} $$

or $z$, half the natural logarithm of this ratio, and apply the usual
test, with $n_1 = 1$, $n_2 = N - 2$.

It will be noted that the square root of (13) is exactly the $t$ of
equation (11) of section 43 in Chapter VI. In fact, when $n_1 = 1$,
the $w$ or $z$ test and the $t$ test give identical results.

To test the constant term we use the ratio

$$ \frac{N(a - \alpha)^2}{\Sigma(Y - a - bx)^2/(N - 2)} \tag{14} $$

or the corresponding $z$.

The test is slightly different from the test of significance of the
difference between a mean $a$ and a hypothetical mean $\alpha$; it tests
the significance of the deviation from $\alpha$ of the estimate $a$ of the
mean value of $Y$ for a given value of $x$ at or near the sample mean.
For a fuller discussion of this point the reader is referred to Fisher,
"Statistical Methods for Research Workers," section 26.

To test the significance of the deviation of the regression
observed in the sample from the population regression we use

$$ \frac{[N(a - \alpha)^2 + (b - \beta)^2\Sigma x^2]/2}{\Sigma(Y - a - bx)^2/(N - 2)} \tag{15} $$

or half its natural logarithm. This can not be changed to a $t$
test because $n_1 = 2$, $n_2 = N - 2$, and the $t$ test can be used only
when $n_1 = 1$.

In our illustrative example,

$$ N(a - \alpha)^2 = 5(3 - \alpha)^2 $$

$$ (b - \beta)^2\Sigma x^2 = (0.376 - \beta)^2\, 45.2 $$

$$ \Sigma(Y - a - bx)^2 = 3.60620 $$

If $\alpha$ and $\beta$ are zero these quantities assume the values 45, 6.39380
(not exactly, but this is merely because 0.376 has been carried to
only three places), and 3.60620, respectively, the last two of which
are the values of Table 24.



59. Application to curvilinear and multiple regression and
correlation. The results of the preceding section admit of extension
to regression of higher order. For instance, we may wish to
see whether we get significantly better results, by fitting a second-degree
regression curve, over what we get by fitting a straight line.
Let the two regression equations be

$$ Y' = a_1 + b_1 X, \qquad Y'' = a_2 + b_2 X + c_2 X^2 \tag{16} $$

Then it can readily be shown by methods already employed that

$$ \Sigma(Y - \bar{Y})^2 = \Sigma(Y - Y'')^2 + \Sigma(Y'' - Y')^2 + \Sigma(Y' - \bar{Y})^2 \tag{17} $$

That is, the total variation about the mean may be analyzed into
three parts:

(i) the residual variation about the parabola.

(ii) the variation of the parabola about the straight line.

(iii) the variation of the straight line about the mean.

For the set of five points considered in the preceding section,
the first- and second-order regression equations were found (see
sections 18 and 21) to be

$$ Y' = 1.646 + 0.376X, \qquad Y'' = 1.3460 + 0.73427X - 0.04497X^2 $$

respectively. The sums of squares of deviations from these were
found to be $\Sigma(Y - Y')^2 = 3.60620$ and $\Sigma(Y - Y'')^2 = 3.22812$.
Then $\Sigma(Y'' - Y')^2 = 3.60620 - 3.22812 = 0.37808$.

The analysis of variance table assumes the form below (Table 
26). The remaining degree of freedom is taken up by the variation 
of the mean about the origin. 


TABLE 26

Analysis of Variance for Parabolic Regression

                        Sum of squares of deviations           Degrees of freedom
Residuals               $\Sigma(Y - Y'')^2 = 3.22812$            2
Parabola about line     $\Sigma(Y'' - Y')^2 = 0.37808$           1
Line about mean         $\Sigma(Y' - \bar{Y})^2 = 6.39380$       1
Total                   $\Sigma(Y - \bar{Y})^2 = 10.00000$       4


















To test the significance of the deviation of the parabola from
the straight line we take the ratio

$$w = \frac{0.37808}{3.22812/2} = \frac{0.37808}{1.61406} = 0.23424$$

The degrees of freedom are $n_1 = 1$, $n_2 = 2$. Three possible ways
of completing the test are now open to us.* We may use Snedecor's
tables on $w$ directly, or Tables VII and VIII on $z = \tfrac{1}{2}\log_e 0.23424$.
A third method, which is the one we shall use, is to calculate
$t = \sqrt{0.23424} = 0.484$. For two degrees of freedom this is decidedly
not significant, since the probability of a value of $t$ this large or
larger is greater than 0.6.
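The arithmetic of this test can be retraced in a few lines. The following is only a sketch: the three sums of squares are copied from Table 26, and the helper name `variance_ratio` is an assumption introduced for illustration, not notation from the text.

```python
import math

# Sums of squares quoted in Table 26.
ss_residual = 3.22812          # about the parabola, 2 degrees of freedom
ss_parabola_vs_line = 0.37808  # parabola about the line, 1 degree of freedom
ss_line_vs_mean = 6.39380      # line about the mean, 1 degree of freedom


def variance_ratio(ss_num, df_num, ss_den, df_den):
    """Ratio w of two mean square deviations."""
    return (ss_num / df_num) / (ss_den / df_den)


# Parabola against line, tested on the residual mean square.
w = variance_ratio(ss_parabola_vs_line, 1, ss_residual, 2)
z = 0.5 * math.log(w)   # half the natural logarithm of w
t = math.sqrt(w)        # valid only because n1 = 1

# The three parts add up to the total sum of squares about the mean.
total = ss_residual + ss_parabola_vs_line + ss_line_vs_mean

print(f"w = {w:.5f}, z = {z:.4f}, t = {t:.3f}, total = {total:.5f}")
```

Since $w < 1$ here, $z$ is negative, which is another way of seeing that the parabola adds nothing significant to the straight line.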

To test the significance of the linear part of the regression we 
can compute 


The probability of a value of t numerically this large or larger is 
between 0.2 and 0.1. Although this is not significant, it is less 
probable than the value obtained from $t = 0.484$. In some cases
it is possible in this way to show, for example, that the linear 
regression fitted to a set of data is significant while a second-order 
regression is not, that is, that nothing is to be gained in such cases 
by fitting anything but a linear regression. 

To test the complete regression in our illustration we could use

$$w = \frac{(6.39380 + 0.37808)/2}{3.22812/2} = 2.0978$$

$$z = \tfrac{1}{2}\log_e w = 0.3704$$

Here we have $n_1 = 2$, $n_2 = 2$, and we cannot use the $t$ test. The
values are not significant.

For a multiple regression equation such as

$$Y' = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$$

we can form Table 27. The sums of squares of deviations have
been expressed in terms of the multiple correlation coefficient $R$.†
(See section 28.)

* Actually there is no necessity of making the test, since $w < 1$.
† For convenience we use $R$ instead of $r_{1.23\cdots k}$.





TABLE 27

Analysis of Variance for Multiple Regression

               Sum of squares of deviations                               Degrees of freedom
Residuals      $\Sigma(Y - Y')^2 = (1 - R^2)\Sigma(Y - \bar{Y})^2$          $N - k - 1$
Regression     $\Sigma(Y' - \bar{Y})^2 = R^2\Sigma(Y - \bar{Y})^2$          $k$
Total          $\Sigma(Y - \bar{Y})^2$                                      $N - 1$



To test the significance of the multiple regression we form the
ratio

$$w = \frac{\Sigma(Y' - \bar{Y})^2/k}{\Sigma(Y - Y')^2/(N - k - 1)} = \frac{R^2/k}{(1 - R^2)/(N - k - 1)} \tag{18}$$

and use Snedecor's tables with $n_1 = k$, $n_2 = N - k - 1$, or set
$z = \tfrac{1}{2}\log_e w$ and use Tables VII and VIII with these same degrees
of freedom. Incidentally we are testing the significance of the
multiple correlation coefficient $R$.
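Formula (18) needs only $R^2$, $N$, and $k$. The sketch below uses a hypothetical function name, `regression_variance_ratio`, and checks itself against the parabolic example above, treating the parabola as a regression on the $k = 2$ variables $X$ and $X^2$ with $N = 5$, so that $R^2 = (6.39380 + 0.37808)/10$.

```python
import math


def regression_variance_ratio(r_squared, n_obs, k):
    """Return (w, n1, n2) for the ratio in formula (18).

    r_squared : square of the multiple correlation coefficient R
    n_obs     : number of observations N
    k         : number of independent variables
    """
    n2 = n_obs - k - 1
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    if k < 1 or n2 < 1:
        raise ValueError("need k >= 1 and N - k - 1 >= 1")
    w = (r_squared / k) / ((1.0 - r_squared) / n2)
    return w, k, n2


# Parabolic illustration: regression sum of squares over total sum of squares.
r_squared = (6.39380 + 0.37808) / 10.0
w, n1, n2 = regression_variance_ratio(r_squared, n_obs=5, k=2)
z = 0.5 * math.log(w)

print(f"w = {w:.4f}, z = {z:.4f}, n1 = {n1}, n2 = {n2}")
```

The result agrees with the values $w = 2.0978$ and $z = 0.3704$ obtained directly from the sums of squares.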

60. Absolute criteria in the theory of regression.* In the
preceding section, our estimate of the variance of $Y$, without
utilizing the regression, is $\Sigma(Y - \bar{Y})^2/(N - 1)$. Our estimate of
the variance of $Y$ taking the regression into consideration is
$(1 - R^2)\Sigma(Y - \bar{Y})^2/(N - k - 1)$. Consequently if our estimation
is to be improved by the use of regression, that is, if our
estimate is to have a smaller variance, we must have

$$\frac{(1 - R^2)\Sigma(Y - \bar{Y})^2}{N - k - 1} < \frac{\Sigma(Y - \bar{Y})^2}{N - 1}$$

which leads to the inequality

$$R^2 > \frac{k}{N - 1} \tag{19}$$

To be somewhat more exact, our estimate of the variance of an
individual forecast without using the regression is our estimate of
the variance about the mean plus the variance of the mean itself.
This estimate is

$$\frac{\Sigma(Y - \bar{Y})^2}{N - 1}\left(1 + \frac{1}{N}\right) \tag{20}$$

* From an unpublished note by Churchill Eisenhart.

The minimum value of the estimate of the variance of a forecast
when the regression is used occurs when $X_1 = \bar{X}_1, \ldots, X_k = \bar{X}_k$,
and is equal to*

$$\frac{(1 - R^2)\Sigma(Y - \bar{Y})^2}{N - k - 1}\left(1 + \frac{1}{N}\right) \tag{21}$$

If (21) is to be less than (20), then

$$\frac{1 - R^2}{N - k - 1} < \frac{1}{N - 1}$$

which leads to (19) as before. For $k = 1$, (19) degenerates into
the inequality

$$r^2 > \frac{1}{N - 1} \tag{22}$$

for the usual correlation coefficient associated with linear regression.

By using such inequalities as the above, we can make a definite
decision regarding the advisability of keeping an additional
independent variable in our regression. Let $R_k^2$ denote the square of
the coefficient of multiple correlation for $k$ independent variables,
and $R_{k+1}^2$ the square of the coefficient for $k + 1$ independent
variables. If the additional variable is to improve our estimation,
we must have

$$\frac{1 - R_{k+1}^2}{N - k - 2} < \frac{1 - R_k^2}{N - k - 1}$$

or,

$$\frac{1 - R_{k+1}^2}{1 - R_k^2} < 1 - \frac{1}{N - k - 1} \tag{23}$$

This is probably the most convenient form of the inequality, as it
gives the necessary smallness of the ratio of the two residual sums

* Cf. R. A. Fisher, "Statistical Methods for Research Workers," section 26.





of squares if the additional variable is to improve our estimation.
Thus, if a simple linear regression ($k = 1$) has been calculated on
twelve observations, we see that it is useless to consider the
addition of a second independent variable unless by doing so we can
reduce the residual sum of squares by more than

$$\frac{1}{12 - 1 - 1} = \frac{1}{10} = 10 \text{ per cent}$$
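Inequalities (19) and (23) are simple enough to evaluate mechanically. The sketch below is illustrative only; the function names are assumptions of this example, not terms used in the text.

```python
def regression_improves_estimate(r_squared, n_obs, k):
    """Inequality (19): the regression helps only if R^2 > k / (N - 1)."""
    return r_squared > k / (n_obs - 1)


def required_reduction(n_obs, k):
    """Least fractional reduction of the residual sum of squares that a
    (k+1)th independent variable must achieve, from inequality (23)."""
    return 1.0 / (n_obs - k - 1)


def extra_variable_improves(r2_k, r2_k_plus_1, n_obs, k):
    """Inequality (23) applied to two fitted regressions."""
    ratio = (1.0 - r2_k_plus_1) / (1.0 - r2_k)
    return ratio < 1.0 - required_reduction(n_obs, k)


# Twelve observations, simple linear regression (k = 1).
print(required_reduction(12, 1))                 # one-tenth of the residual sum of squares
print(regression_improves_estimate(0.20, 12, 1))
print(extra_variable_improves(0.50, 0.53, 12, 1))
```

In the last call the second variable removes only 6 per cent of the residual sum of squares, less than the 10 per cent required, so it would not improve the estimation.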


It is not intended to suggest that these criteria be used as tests
of significance, but merely to indicate that there exist such absolute
criteria (criteria independent of the significance level chosen), and to
suggest that these considerations be taken into account in
experiments in which the use of simple or multiple regression methods
may be contemplated.

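61. Testing the significance of the correlation ratio. In
section 25 two different forms for the square of the correlation ratio
were given, viz.,

$$\eta^2 = 1 - \frac{\sum_{X=X_1}^{X_k}\sum_{i=1}^{N_X}(Y_{Xi} - \bar{Y}_X)^2}{\sum_{X=X_1}^{X_k}\sum_{i=1}^{N_X}(Y_{Xi} - \bar{Y})^2} \tag{24}$$

and

$$\eta^2 = \frac{\sum_{X=X_1}^{X_k} N_X(\bar{Y}_X - \bar{Y})^2}{\sum_{X=X_1}^{X_k}\sum_{i=1}^{N_X}(Y_{Xi} - \bar{Y})^2} \tag{25}$$

For the meaning of the notation, refer to section 25. If we equate
these two values and clear of fractions we obtain the fundamental
identity

$$\sum_{X=X_1}^{X_k}\sum_{i=1}^{N_X}(Y_{Xi} - \bar{Y})^2 = \sum_{X=X_1}^{X_k}\sum_{i=1}^{N_X}(Y_{Xi} - \bar{Y}_X)^2 + \sum_{X=X_1}^{X_k} N_X(\bar{Y}_X - \bar{Y})^2 \tag{26}$$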


which expresses the fact that in a correlation table the sum of 
squares of deviations of the F’s about their mean is equal to the 
sum of squares of the deviations about the means of columns plus 
the (weighted) sum of squares of the deviations of the means of 





columns about the general mean. This enables us to form the 
analysis of variance table (Table 28). 


TABLE 28

Analysis of Variance for Correlation Ratio

                  Sum of squares of deviations                                                            Degrees of freedom
Within columns    $\sum_X\sum_i(Y_{Xi} - \bar{Y}_X)^2 = (1 - \eta^2)\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$      $N - k$
Column means      $\sum_X N_X(\bar{Y}_X - \bar{Y})^2 = \eta^2\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$             $k - 1$
Total             $\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$                                                        $N - 1$

In each sum $X$ runs from $X_1$ to $X_k$ and $i$ from 1 to $N_X$.


We can calculate the ratio of the mean square deviation of the
column means to the mean square deviation within the columns,

$$w = \frac{\eta^2/(k - 1)}{(1 - \eta^2)/(N - k)} = \frac{N - k}{k - 1} \cdot \frac{\eta^2}{1 - \eta^2} \tag{27}$$

and use either $w$ or $\tfrac{1}{2}\log_e w$ in the usual manner to test the
significance of an observed value of $\eta$.

As an illustration we shall consider the value of $\eta$ computed in
section 25 of Chapter IV. There we found

$$\sum_X\sum_i(Y_{Xi} - \bar{Y}_X)^2 = 1 + 2.83 + 4.4 + 2 + 0 = 10.23$$

$$\sum_X N_X(\bar{Y}_X - \bar{Y})^2 = 5.77, \qquad \sum_X\sum_i(Y_{Xi} - \bar{Y})^2 = 16$$

The number of columns was $k = 5$, and the total number of items








in the correlation table was $N = 25$. Consequently the analysis
of variance table assumes the form shown in Table 29.


TABLE 29

                  Sum of squares     Degrees of        Mean square
                  of deviations      freedom           deviation
Within columns    10.23              N - k = 20        0.5115
Column means       5.77              k - 1 = 4         1.4425
Total             16                 N - 1 = 24

$$z = \tfrac{1}{2}\log_e\frac{1.4425}{0.5115} = 0.5184$$


This is just under the 5 per cent point 0.5265 and can hardly be 
regarded as significant. 
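The test of the correlation ratio can be reproduced from the two component sums of squares alone. This is a sketch under that assumption; the function name `correlation_ratio_test` is hypothetical, and the inputs are the figures 10.23 (within columns) and 5.77 (column means) with $N = 25$, $k = 5$.

```python
import math


def correlation_ratio_test(ss_within, ss_column_means, n_items, k_columns):
    """Return (eta_squared, w, z) for the analysis of Table 28.

    ss_within       : sum of squares about the column means
    ss_column_means : weighted sum of squares of column means about the general mean
    """
    total = ss_within + ss_column_means            # identity (26)
    eta_squared = ss_column_means / total          # formula (25)
    ms_columns = ss_column_means / (k_columns - 1)
    ms_within = ss_within / (n_items - k_columns)
    w = ms_columns / ms_within                     # formula (27)
    return eta_squared, w, 0.5 * math.log(w)


eta_squared, w, z = correlation_ratio_test(10.23, 5.77, n_items=25, k_columns=5)
print(f"eta^2 = {eta_squared:.4f}, w = {w:.4f}, z = {z:.4f}")
```

The value of $z$ so obtained, about 0.518, is the quantity compared with the 5 per cent point 0.5265.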

62. Testing linearity of regression. To test linearity of
regression, that is, to test whether $\eta^2$ is significantly larger than $r^2$, we
subdivide the sums of squares still further.

Suppose that in a correlation table we have fitted a regression
line $Y' = a + bX$. Let $Y'_X$ be the ordinate of this line
corresponding to a fixed value of $X$, and let $\bar{Y}_X$ be the mean of $Y$ for
this value of $X$, that is, the mean of the column whose central
abscissa is $X$. Then it can be demonstrated that

$$\sum_X\sum_i(Y_{Xi} - \bar{Y})^2 = \sum_X\sum_i(Y_{Xi} - \bar{Y}_X)^2 + \sum_X N_X(\bar{Y}_X - Y'_X)^2 + \sum_X N_X(Y'_X - \bar{Y})^2 \tag{28}$$

In words, the sum of squares of deviations about the mean is
composed of the sum of squares of deviations about the column
means, the (weighted) sum of squares of the deviations of the
column means from the regression line, and the (weighted) sum of
squares of deviations of the ordinates of the regression line about
the general mean.

The sum of the first two terms on the right of (28) is equal to

$$\sum_X\sum_i(Y_{Xi} - Y'_X)^2$$













which is the sum of squares of deviations about the straight
line of regression, and which is consequently equal to
$(1 - r^2)\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$. It is not difficult to prove the other
relations necessary for setting up the analysis of variance table
(Table 30).

TABLE 30

Analysis of Variance for Testing Linearity of Regression

                                       Sum of squares of deviations                                                   Degrees of freedom
Residuals about column means           $\sum_X\sum_i(Y_{Xi} - \bar{Y}_X)^2 = (1 - \eta^2)\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$    $N - k$
Column means about regression line     $\sum_X N_X(\bar{Y}_X - Y'_X)^2 = (\eta^2 - r^2)\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$      $k - 2$
About regression line (subtotal)       $\sum_X\sum_i(Y_{Xi} - Y'_X)^2 = (1 - r^2)\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$            $N - 2$
Regression                             $\sum_X N_X(Y'_X - \bar{Y})^2 = r^2\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$                   $1$
Total                                  $\sum_X\sum_i(Y_{Xi} - \bar{Y})^2$                                                      $N - 1$


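To test whether $\eta$ is significantly greater than $r$ we set

$$z = \frac{1}{2}\log_e\frac{\sum_X N_X(\bar{Y}_X - Y'_X)^2/(k - 2)}{\sum_X\sum_i(Y_{Xi} - \bar{Y}_X)^2/(N - k)} = \frac{1}{2}\log_e\frac{(\eta^2 - r^2)/(k - 2)}{(1 - \eta^2)/(N - k)}$$

$$n_1 = k - 2, \qquad n_2 = N - k$$

and proceed in the usual manner.

In the example already cited,

$$\eta^2 = 0.3604, \quad r^2 = 0.3299, \quad N = 25, \quad k = 5 \text{ (No. of columns)}$$

$$z = \frac{1}{2}\log_e\frac{(0.3604 - 0.3299)/3}{(1 - 0.3604)/20} = \frac{1}{2}\log_e\frac{0.01017}{0.03198}$$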











The numerator of the ratio is smaller than the denominator and
consequently the difference between $\eta^2$ and $r^2$ cannot possibly be
significant.
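The comparison of numerator and denominator can be checked numerically. The sketch below assumes only the four quantities quoted in the example ($\eta^2 = 0.3604$, $r^2 = 0.3299$, $N = 25$, $k = 5$); the function name `linearity_test` is introduced here for illustration.

```python
import math


def linearity_test(eta_squared, r_squared, n_items, k_columns):
    """Return (numerator, denominator, z) of the linearity test.

    numerator   : (eta^2 - r^2) / (k - 2), column means about the regression line
    denominator : (1 - eta^2) / (N - k), residuals about the column means
    """
    numerator = (eta_squared - r_squared) / (k_columns - 2)
    denominator = (1.0 - eta_squared) / (n_items - k_columns)
    z = 0.5 * math.log(numerator / denominator)
    return numerator, denominator, z


numerator, denominator, z = linearity_test(0.3604, 0.3299, n_items=25, k_columns=5)
print(f"numerator = {numerator:.5f}, denominator = {denominator:.5f}, z = {z:.4f}")
```

A negative $z$ corresponds to a ratio below unity, so no table lookup is needed in this case.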

63. Variance within and among classes. Suppose that we
have $k$ classes of individuals with $m$ individuals in each class. We
may wish to discover whether there is significantly more variation
among the classes, that is, among the class means, than there is
within the classes. Let the measurement of the characteristic of
the $i$th individual in the $j$th class be denoted by $X_{ij}$, and let the
mean of the $j$th class be denoted by $\bar{X}_j$. (See Table 31.) Then


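TABLE 31

                         Class
           1          2        ...      j        ...      k
         $X_{11}$   $X_{12}$   ...   $X_{1j}$    ...   $X_{1k}$
         $X_{21}$   $X_{22}$   ...   $X_{2j}$    ...   $X_{2k}$
           ...        ...               ...               ...
         $X_{i1}$   $X_{i2}$   ...   $X_{ij}$    ...   $X_{ik}$
           ...        ...               ...               ...
         $X_{m1}$   $X_{m2}$   ...   $X_{mj}$    ...   $X_{mk}$
Total    $T_1$      $T_2$      ...   $T_j$       ...   $T_k$
Mean     $\bar{X}_1$  $\bar{X}_2$  ...  $\bar{X}_j$  ...  $\bar{X}_k$

$$\begin{aligned}
\sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij} - \bar{X})^2
&= \sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij} - \bar{X}_j)^2
 + 2\sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij} - \bar{X}_j)(\bar{X}_j - \bar{X})
 + \sum_{j=1}^{k}\sum_{i=1}^{m}(\bar{X}_j - \bar{X})^2 \\
&= \sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij} - \bar{X}_j)^2 + m\sum_{j=1}^{k}(\bar{X}_j - \bar{X})^2
\end{aligned} \tag{30}$$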








the middle term vanishing on summation. The importance of 
this formula lies in the fact that it separates the sum of squares of 
deviations about the general mean into the sum of squares of 
deviations about the class means and the sum of squares of devia¬ 
tions of the class means about the general mean (multiplied by the 
number in each class, of course). These two components may be 
spoken of as the sum of squares of deviations within classes and 
among classes respectively. The formula is exhibited in tabular 
form in Table 32. The total number of items in the entire group 

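TABLE 32

Analysis of Variance Within and Among Classes

                  Sum of squares of deviations                              Degrees of freedom
Within classes    $\sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij} - \bar{X}_j)^2$        $k(m - 1)$
Among classes     $m\sum_{j=1}^{k}(\bar{X}_j - \bar{X})^2$                    $k - 1$
Total             $\sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij} - \bar{X})^2$          $mk - 1$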



is mk, and the total number of degrees of freedom is mk — 1, 
unity having been deducted because we are calculating deviations 
about the mean. That is, one degree of freedom is taken up by 
the deviation of the mean about the zero point. The number of 
degrees of freedom within each class is similarly m — 1, and since 
there are k classes, the number of degrees of freedom within 
classes is k(m — 1). The number of degrees of freedom among 
classes is $k - 1$, one less than the number of classes.

To test whether the variance among classes is significant we use

$$w = \frac{m\sum_j(\bar{X}_j - \bar{X})^2/(k - 1)}{\sum_j\sum_i(X_{ij} - \bar{X}_j)^2/k(m - 1)} \tag{31}$$

or $z = \tfrac{1}{2}\log_e w$, with $n_1 = k - 1$, $n_2 = k(m - 1)$.









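For actual computation it is probably preferable to transform
the foregoing formulas somewhat. Thus, for the total sum of
squares of deviations we have

$$\sum_j\sum_i(X_{ij} - \bar{X})^2 = \sum_j\sum_i X_{ij}^2 - \frac{T^2}{N} \tag{32}$$

where $N = mk$, and $T$ is the grand total.

For the sum of squares of deviations among the class means
we find

$$m\sum_j(\bar{X}_j - \bar{X})^2 = m\sum_j\left(\frac{\sum_i X_{ij}}{m}\right)^2 - \frac{1}{N}\Big(\sum_j\sum_i X_{ij}\Big)^2 = \frac{\sum_j T_j^2}{m} - \frac{T^2}{N} \tag{33}$$

$T_j$ being the total of the $j$th class.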

The sum of squares of deviations within classes can be calcu¬ 
lated as a remainder by subtracting the sum of squares of devia¬ 
tions among class means from the total sum of squares of devia¬ 
tions. 

It may be remarked that

$$w = \frac{1 + (m - 1)r'}{1 - r'} \tag{34}$$

where $r'$ is the intraclass correlation coefficient,* so that our test will
incidentally test the significance of this coefficient.
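Relation (34) can be solved for $r'$, giving $r' = (w - 1)/(w + m - 1)$, so either quantity is recoverable from the other. The sketch below uses function names chosen for this example only.

```python
def variance_ratio_from_intraclass(r_prime, m):
    """Formula (34): w as a function of the intraclass correlation r'."""
    return (1.0 + (m - 1) * r_prime) / (1.0 - r_prime)


def intraclass_from_variance_ratio(w, m):
    """Inverse of (34): solve w (1 - r') = 1 + (m - 1) r' for r'."""
    return (w - 1.0) / (w + m - 1.0)


# Round trip for classes of m = 5 individuals.
w = variance_ratio_from_intraclass(0.3, m=5)
r_back = intraclass_from_variance_ratio(w, m=5)
print(w, r_back)

# A variance ratio below unity corresponds to a negative intraclass correlation.
print(intraclass_from_variance_ratio(0.5, m=5))
```

The second call illustrates that whenever the mean square among classes is smaller than that within classes, $r'$ is negative.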

As an illustration of variance among classes let us consider 
Table 33, which shows the ulna lengths of four strains of hens. 
Measurements of five individuals from each strain are given. 
Actually, measurements would doubtless be made to a greater 
degree of accuracy. These numbers are given to only two sig¬ 
nificant figures for simplicity of illustration. 

* See R. A. Fisher, "Statistical Methods for Research Workers."





TABLE 33

Ulna Lengths (in Millimeters) in 4 Strains of Hens

                     Strain
            1       2       3       4
           67      68      72      66
           66      68      70      70
           66      71      68      65
           73      69      65      64
           66      58      70      67
Total     338     334     345     332
Mean     67.6    66.8    69.0    66.4

Grand total = $T$ = 1349

General mean = $\bar{X}$ = 1349/20 = 67.45


Ordinarily the calculation would be performed on a machine.* 
In the present example we shall list the squares of the items in 
Table 34 so that the application of the formulas involved in the 
analysis can be more easily understood. 

TABLE 34

Squares of Items in Table 33

            4,489     4,624     5,184     4,356
            4,356     4,624     4,900     4,900
            4,356     5,041     4,624     4,225
            5,329     4,761     4,225     4,096
            4,356     3,364     4,900     4,489
Total      22,886    22,414    23,833    22,066

Grand total = 91,199

* It might sometimes be of advantage to express all numbers as deviations
from an arbitrary origin. For example, if we had a long series of numbers in
the 700's, such as 726, 702, 718, ..., we could write 26, 2, 18, ..., and,
dealing with these deviations, obtain precisely the same results.








From formula (32) we find for the total sum of squares of
deviations

$$91{,}199 - \frac{(1349)^2}{20} = 91{,}199 - 90{,}990.05 = 208.95$$

From (33) the sum of squares of deviations among means of
strains is

$$\begin{aligned}
\tfrac{1}{5}\left[(338)^2 + (334)^2 + (345)^2 + (332)^2\right] - \frac{(1349)^2}{20}
&= \tfrac{1}{5}(114{,}244 + 111{,}556 + 119{,}025 + 110{,}224) - \frac{(1349)^2}{20} \\
&= \tfrac{1}{5} \times 455{,}049 - 90{,}990.05 = 19.75
\end{aligned}$$

The sum of squares of deviations within strains is the difference
208.95 - 19.75 = 189.20, and the analysis of variance table is as
follows:

TABLE 35

                   Sum of squares     Degrees of             Mean square
                   of deviations      freedom                deviation
Among strains       19.75             4 - 1 = 3               6.583
Within strains     189.20             4(5 - 1) = 16          11.825
Total              208.95             4 X 5 - 1 = 19



It is seen that there is less variation among strains than within 
strains. 
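The whole computation for the hen data can be retraced with formulas (32) and (33). This is a minimal sketch: the measurements are those of Table 33, with the fifth entry for strain 2 taken as 58, the value whose square (3,364) appears in Table 34 and which reproduces the column total 334. The function name `one_way_anova` is an assumption of this example.

```python
# Ulna lengths from Table 33, one list per strain (m = 5 hens each).
strains = [
    [67, 66, 66, 73, 66],
    [68, 68, 71, 69, 58],
    [72, 70, 68, 65, 70],
    [66, 70, 65, 64, 67],
]


def one_way_anova(classes):
    """Return (total, among, within) sums of squares for equal-sized classes."""
    m = len(classes[0])
    if any(len(c) != m for c in classes):
        raise ValueError("this sketch assumes equal class sizes")
    n = m * len(classes)
    grand_total = sum(sum(c) for c in classes)
    correction = grand_total ** 2 / n                                 # T^2 / N
    total_ss = sum(x * x for c in classes for x in c) - correction    # formula (32)
    among_ss = sum(sum(c) ** 2 for c in classes) / m - correction     # formula (33)
    return total_ss, among_ss, total_ss - among_ss


total_ss, among_ss, within_ss = one_way_anova(strains)
k, m = len(strains), len(strains[0])
ms_among = among_ss / (k - 1)          # 3 degrees of freedom
ms_within = within_ss / (k * (m - 1))  # 16 degrees of freedom

print(f"total = {total_ss:.2f}, among = {among_ss:.2f}, within = {within_ss:.2f}")
print(f"mean squares: among = {ms_among:.3f}, within = {ms_within:.3f}")
```

Because the mean square among strains comes out smaller than that within strains, the ratio $w$ is below unity and no significance test is required.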

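If there are unequal numbers in the various classes, say $N_1$ in
the first class, $N_2$ in the second, and so on, the fundamental
formula is

$$\sum_{j=1}^{k}\sum_{i=1}^{N_j}(X_{ij} - \bar{X})^2 = \sum_{j=1}^{k}\sum_{i=1}^{N_j}(X_{ij} - \bar{X}_j)^2 + \sum_{j=1}^{k}N_j(\bar{X}_j - \bar{X})^2 \tag{35}$$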

/-I 4-1 















This is exactly comparable to the formula (26) used in connection
with the correlation ratio. In fact, our classes correspond to the
columns of the correlation table from which the correlation ratio
is calculated, and consequently to test whether the variation among
class means is significantly greater than within classes we proceed
as before, using the ratio

$$w = \frac{\sum_j N_j(\bar{X}_j - \bar{X})^2/(k - 1)}{\sum_j\sum_i(X_{ij} - \bar{X}_j)^2/(N - k)} \tag{36}$$

or half its natural logarithm.
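With unequal class numbers the same partition applies, each class mean being weighted by its own $N_j$. The sketch below is illustrative: the function name and the three small classes are hypothetical and were chosen only so that the arithmetic is easy to follow by hand.

```python
import math


def one_way_anova_unequal(classes):
    """Return (among, within, w) for classes of possibly unequal size."""
    k = len(classes)
    n = sum(len(c) for c in classes)
    grand_mean = sum(sum(c) for c in classes) / n
    among = 0.0
    within = 0.0
    for c in classes:
        class_mean = sum(c) / len(c)
        among += len(c) * (class_mean - grand_mean) ** 2    # N_j (mean_j - mean)^2
        within += sum((x - class_mean) ** 2 for x in c)
    w = (among / (k - 1)) / (within / (n - k))              # ratio (36)
    return among, within, w


# Hypothetical data: three classes with 3, 2 and 4 items.
classes = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
among, within, w = one_way_anova_unequal(classes)
z = 0.5 * math.log(w)

# The two parts add up to the total sum of squares about the general mean.
all_items = [x for c in classes for x in c]
mean = sum(all_items) / len(all_items)
total = sum((x - mean) ** 2 for x in all_items)

print(among, within, total, w, z)
```

The degrees of freedom for looking up $w$ or $z$ are $n_1 = k - 1 = 2$ and $n_2 = N - k = 6$ in this made-up case.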

64. Subdivision of variance into more than two portions.* 

It often happens that there is a connection between the individual 
items in the different classes. For example, we might have obser¬ 
vations on the first flowering date of four different varieties of 
plants at ten different stations. The observations could be classi¬ 
fied both according to variety and to locality. We should in gen¬ 
eral have a table like Table 31, but should want the means of rows 
as well as columns. (See Table 36.) 


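TABLE 36

            1           2        ...      j        ...      k          Mean
1        $X_{11}$    $X_{12}$    ...   $X_{1j}$    ...   $X_{1k}$    $\bar{X}_{1.}$
2        $X_{21}$    $X_{22}$    ...   $X_{2j}$    ...   $X_{2k}$    $\bar{X}_{2.}$
...
i        $X_{i1}$    $X_{i2}$    ...   $X_{ij}$    ...   $X_{ik}$    $\bar{X}_{i.}$
...
m        $X_{m1}$    $X_{m2}$    ...   $X_{mj}$    ...   $X_{mk}$    $\bar{X}_{m.}$
Mean     $\bar{X}_{.1}$  $\bar{X}_{.2}$  ...  $\bar{X}_{.j}$  ...  $\bar{X}_{.k}$  $\bar{X}$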


* To obtain a clear insight into the foundations of the analysis of variance
one should read two papers by J. O. Irwin, "Mathematical theorems involved
in the analysis of variance," Journal of the Royal Statistical Society, vol. 94,
1931, pp. 284-300; and "Independence of the constituent items in the analysis
of variance," Supplement to the Journal of the Royal Statistical Society, vol. 1,
1934, pp. 236-261.


















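Let $\bar{X}_{i.}$ denote the mean of the $i$th row, $\bar{X}_{.j}$ the mean of the
$j$th column, and $\bar{X}$ the general mean. The fundamental identity is

$$\sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij} - \bar{X})^2 = k\sum_{i=1}^{m}(\bar{X}_{i.} - \bar{X})^2 + m\sum_{j=1}^{k}(\bar{X}_{.j} - \bar{X})^2 + \sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X})^2 \tag{37}$$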


The last term is called the error or interaction. It has been freed 
of the effect of both rows and columns, and is therefore assumed to 
be due to experimental error only. Ordinarily it is not calculated 
directly, but as the remainder after deducting the other two terms 
from the total sum of squares of deviations. Exhibiting formula 
(37) as an analysis of variance table, we have Table 37. The 


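TABLE 37

                      Sum of squares of deviations                                         Degrees of freedom
Means of rows         $k\sum_i(\bar{X}_{i.} - \bar{X})^2$                                    $m - 1$
Means of columns      $m\sum_j(\bar{X}_{.j} - \bar{X})^2$                                    $k - 1$
Error                 $\sum_j\sum_i(X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X})^2$       $(m - 1)(k - 1)$
Total                 $\sum_j\sum_i(X_{ij} - \bar{X})^2$                                     $mk - 1$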


number of degrees of freedom for rows (or columns) is one less 
than the number of rows (or columns). The number of degrees of 
freedom for error is, as in a contingency table, the number of com¬ 
partments of the table that can be arbitrarily filled, keeping the 
marginal totals or means constant, viz., $(m - 1)(k - 1)$.

If our data are a random sample from a homogeneous normal 
population, then each sum of squares of deviations in Table 37, 
when divided by the corresponding number of degrees of freedom,
gives an unbiased estimate of the variance of the population. If 
the population is not homogeneous, that is, if it is more variable in 
one way than another, this will show up in the different estimates 
of the population variance; some of them will be greater than 









others. It is usual to compare the columns and rows with the 
error or interaction. 

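To test whether there is significant variation in the means of
rows we use

$$w = \frac{k\sum_i(\bar{X}_{i.} - \bar{X})^2/(m - 1)}{\sum_j\sum_i(X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X})^2/(m - 1)(k - 1)} \tag{38}$$

$$n_1 = m - 1, \qquad n_2 = (m - 1)(k - 1)$$

and to test the variation of the means of columns we use

$$w = \frac{m\sum_j(\bar{X}_{.j} - \bar{X})^2/(k - 1)}{\sum_j\sum_i(X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X})^2/(m - 1)(k - 1)} \tag{39}$$

$$n_1 = k - 1, \qquad n_2 = (m - 1)(k - 1)$$

Of course, we have the option of using $z = \tfrac{1}{2}\log_e w$ in either case.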





Fig. 11.—Analysis of Variance Diagram. 

Formula (37) can be represented geometrically by letting the
square root of the left side be the diagonal of a rectangular
parallelopiped of which the edges are the square roots of the three
terms on the right side, as in Fig. 11. In this figure the symbol $S$ is
used to indicate summation over every individual in the sample; e.g.,

$$S(\bar{X}_{i.} - \bar{X})^2 = \sum_{j=1}^{k}\sum_{i=1}^{m}(\bar{X}_{i.} - \bar{X})^2$$

As a very simple illustration of formula (37) let us consider
Table 38. We find: for total







TABLE 38

                                      Total    Mean
            2      3      5      6      16       4
            4      7      8      5      24       6
            6      5      5      4      20       5
Total      12     15     18     15      60      15
Mean        4      5      6      5      20       5


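$$\begin{aligned}
\sum_{j=1}^{4}\sum_{i=1}^{3}(X_{ij} - \bar{X})^2 &= (2 - 5)^2 + (3 - 5)^2 + (5 - 5)^2 + (6 - 5)^2 \\
&\quad + (4 - 5)^2 + (7 - 5)^2 + (8 - 5)^2 + (5 - 5)^2 \\
&\quad + (6 - 5)^2 + (5 - 5)^2 + (5 - 5)^2 + (4 - 5)^2 \\
&= 30
\end{aligned}$$

For means of rows

$$4\sum_{i=1}^{3}(\bar{X}_{i.} - \bar{X})^2 = 4[(4 - 5)^2 + (6 - 5)^2 + (5 - 5)^2] = 4(1 + 1 + 0) = 8$$

For means of columns

$$3\sum_{j=1}^{4}(\bar{X}_{.j} - \bar{X})^2 = 3[(4 - 5)^2 + (5 - 5)^2 + (6 - 5)^2 + (5 - 5)^2] = 3(1 + 0 + 1 + 0) = 6$$

For error

$$\begin{aligned}
\sum_{j=1}^{4}\sum_{i=1}^{3}(X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X})^2
&= (2 - 4 - 4 + 5)^2 + (3 - 4 - 5 + 5)^2 + (5 - 4 - 6 + 5)^2 \\
&\quad + (6 - 4 - 5 + 5)^2 + (4 - 6 - 4 + 5)^2 + (7 - 6 - 5 + 5)^2 \\
&\quad + (8 - 6 - 6 + 5)^2 + (5 - 6 - 5 + 5)^2 + (6 - 5 - 4 + 5)^2 \\
&\quad + (5 - 5 - 5 + 5)^2 + (5 - 5 - 6 + 5)^2 + (4 - 5 - 5 + 5)^2 \\
&= 16
\end{aligned}$$

Check: 30 = 8 + 6 + 16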








In actual practice the sums of squares of deviations would not
be calculated as above. Perhaps the most satisfactory formula
for use in machine calculation is the following:

$$\sum_j\sum_i(X_{ij} - \bar{X})^2 = \sum_j\sum_i X_{ij}^2 - \frac{T^2}{N} \tag{40}$$

in which $N = mk$, and $T$ is the grand total.

In the present example this total sum of squares of deviations
would be

$$2^2 + 3^2 + 5^2 + 6^2 + 4^2 + 7^2 + 8^2 + 5^2 + 6^2 + 5^2 + 5^2 + 4^2 - \frac{(60)^2}{12} = 330 - 300 = 30$$

The sum of squares, 330, and the total, 60, could be run off
simultaneously on the machine.

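Then for rows we could use the formula

$$k\sum_i(\bar{X}_{i.} - \bar{X})^2 = \frac{\sum_i T_i^2}{k} - \frac{T^2}{N} \tag{41}$$

where $T_i$ is the total of the $i$th row, and $T$ is the grand total. This
gives

$$\tfrac{1}{4}[(16)^2 + (24)^2 + (20)^2] - \tfrac{1}{12}(60)^2 = 308 - 300 = 8$$

The formula for columns, analogous to (41), is

$$\begin{aligned}
m\sum_j(\bar{X}_{.j} - \bar{X})^2 &= \frac{1}{m}\sum_j\Big(\sum_i X_{ij}\Big)^2 - \frac{1}{N}\Big(\sum_j\sum_i X_{ij}\Big)^2 = \frac{\sum_j T_j^2}{m} - \frac{T^2}{N} \\
&= \tfrac{1}{3}[(12)^2 + (15)^2 + (18)^2 + (15)^2] - \tfrac{1}{12}(60)^2 = 306 - 300 = 6
\end{aligned}$$

$T_j$ is the total of column $j$.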





As has been stated above, the error term would be obtained by 
subtraction (30 — 8 — 6 = 16). 

The analysis of variance table would appear as follows: 

TABLE 39

             Sum of squares     Degrees of            Mean square
             of deviations      freedom               deviation
Rows          8                 3 - 1 = 2             4
Columns       6                 4 - 1 = 3             2
Error        16                 3 X 2 = 6             2.67
Total        30                 3 X 4 - 1 = 11



To test the significance of the variation among rows we set

$$z = \tfrac{1}{2}\log_e w = \tfrac{1}{2}\log_e\frac{4}{2.67} = \tfrac{1}{2}\log_e 1.5 = 0.2027$$

$$n_1 = 2, \qquad n_2 = 6$$

Such a value would not be significant. 

The mean square deviation for columns is less than that for 
error and would not need to be tested. 
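The double classification of Table 38 can be worked through with the machine formulas (40) and (41), the error term being found as a remainder. This is only a sketch; the function name `two_way_anova` and the dictionary keys are assumptions of the example.

```python
import math

# Body of Table 38: m = 3 rows, k = 4 columns.
table = [
    [2, 3, 5, 6],
    [4, 7, 8, 5],
    [6, 5, 5, 4],
]


def two_way_anova(rows):
    """Partition the total sum of squares into rows, columns and error."""
    m, k = len(rows), len(rows[0])
    n = m * k
    grand_total = sum(sum(r) for r in rows)
    correction = grand_total ** 2 / n                                  # T^2 / N
    total_ss = sum(x * x for r in rows for x in r) - correction       # formula (40)
    row_ss = sum(sum(r) ** 2 for r in rows) / k - correction          # formula (41)
    col_ss = sum(sum(c) ** 2 for c in zip(*rows)) / m - correction    # column analogue
    error_ss = total_ss - row_ss - col_ss                             # remainder
    return {
        "total": total_ss,
        "rows": row_ss,
        "columns": col_ss,
        "error": error_ss,
        "df_rows": m - 1,
        "df_columns": k - 1,
        "df_error": (m - 1) * (k - 1),
    }


result = two_way_anova(table)
ms_rows = result["rows"] / result["df_rows"]
ms_error = result["error"] / result["df_error"]
z_rows = 0.5 * math.log(ms_rows / ms_error)

print(result)
print(f"z for rows = {z_rows:.4f}")
```

Using the unrounded error mean square 16/6 gives the ratio 1.5 exactly, in agreement with the value of $z$ in the text.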

As was suggested above, the observations might be first 
flowering dates of plants. The columns might then refer to dif¬ 
ferent varieties and the rows to different observation localities. 
Or the columns might refer to different varieties and the rows to 
different years. For rainfall data we might have twelve columns 
for the various months and use the rows for years. Other examples 
could be adduced without limit, and the method is very broadly 
useful. 

The foregoing method of subdivision admits of still further 
extension. Rainfall data, for instance, might be classified accord¬ 
ing to month and year and station at which measured. Or for 
a given station it might be classified according to the hour of the 
day as well as according to month and year.

Suppose then that we have $N = klm$ observed values which are

















subjected to a triple classification into groups, columns, and rows. 
(See Table 40.) Let there be I groups of k rows and m columns 

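TABLE 40

                                   Column
                      1              j              m              Mean
Group 1
  row 1            $X_{111}$      $X_{11j}$      $X_{11m}$      $\bar{X}_{11.}$
  row h            $X_{h11}$      $X_{h1j}$      $X_{h1m}$      $\bar{X}_{h1.}$
  row k            $X_{k11}$      $X_{k1j}$      $X_{k1m}$      $\bar{X}_{k1.}$
  Mean             $\bar{X}_{.11}$  $\bar{X}_{.1j}$  $\bar{X}_{.1m}$  $\bar{X}_{.1.}$
Group i
  row 1            $X_{1i1}$      $X_{1ij}$      $X_{1im}$      $\bar{X}_{1i.}$
  row h            $X_{hi1}$      $X_{hij}$      $X_{him}$      $\bar{X}_{hi.}$
  row k            $X_{ki1}$      $X_{kij}$      $X_{kim}$      $\bar{X}_{ki.}$
  Mean             $\bar{X}_{.i1}$  $\bar{X}_{.ij}$  $\bar{X}_{.im}$  $\bar{X}_{.i.}$
Group l
  row 1            $X_{1l1}$      $X_{1lj}$      $X_{1lm}$      $\bar{X}_{1l.}$
  row h            $X_{hl1}$      $X_{hlj}$      $X_{hlm}$      $\bar{X}_{hl.}$
  row k            $X_{kl1}$      $X_{klj}$      $X_{klm}$      $\bar{X}_{kl.}$
  Mean             $\bar{X}_{.l1}$  $\bar{X}_{.lj}$  $\bar{X}_{.lm}$  $\bar{X}_{.l.}$
Means of rows
  row 1            $\bar{X}_{1.1}$  $\bar{X}_{1.j}$  $\bar{X}_{1.m}$  $\bar{X}_{1..}$
  row h            $\bar{X}_{h.1}$  $\bar{X}_{h.j}$  $\bar{X}_{h.m}$  $\bar{X}_{h..}$
  row k            $\bar{X}_{k.1}$  $\bar{X}_{k.j}$  $\bar{X}_{k.m}$  $\bar{X}_{k..}$
  Mean             $\bar{X}_{..1}$  $\bar{X}_{..j}$  $\bar{X}_{..m}$  $\bar{X}$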


each. Let $X_{hij}$ be the item in row $h$ and column $j$ of the $i$th group.
Let

$\bar{X}_{hi.}$ = mean of items in row $h$ of group $i$,

$\bar{X}_{h.j}$ = mean of items in row $h$ and column $j$,

$\bar{X}_{.ij}$ = mean of items in column $j$ of group $i$,

$\bar{X}_{h..}$ = mean of items in all $h$th rows,

$\bar{X}_{.i.}$ = mean of items in group $i$,

$\bar{X}_{..j}$ = mean of items in all $j$th columns,

$\bar{X}$ = general mean.

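Then the fundamental identity, which we shall state without
proof, is

$$\begin{aligned}
\sum_{h=1}^{k}\sum_{i=1}^{l}\sum_{j=1}^{m}(X_{hij} - \bar{X})^2
&= lm\sum_{h=1}^{k}(\bar{X}_{h..} - \bar{X})^2 + km\sum_{i=1}^{l}(\bar{X}_{.i.} - \bar{X})^2 + kl\sum_{j=1}^{m}(\bar{X}_{..j} - \bar{X})^2 \\
&\quad + m\sum_{h=1}^{k}\sum_{i=1}^{l}(\bar{X}_{hi.} - \bar{X}_{h..} - \bar{X}_{.i.} + \bar{X})^2 \\
&\quad + l\sum_{h=1}^{k}\sum_{j=1}^{m}(\bar{X}_{h.j} - \bar{X}_{h..} - \bar{X}_{..j} + \bar{X})^2 \\
&\quad + k\sum_{i=1}^{l}\sum_{j=1}^{m}(\bar{X}_{.ij} - \bar{X}_{.i.} - \bar{X}_{..j} + \bar{X})^2 \\
&\quad + \sum_{h=1}^{k}\sum_{i=1}^{l}\sum_{j=1}^{m}(X_{hij} - \bar{X}_{hi.} - \bar{X}_{h.j} - \bar{X}_{.ij} + \bar{X}_{h..} + \bar{X}_{.i.} + \bar{X}_{..j} - \bar{X})^2
\end{aligned} \tag{42}$$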

The corresponding analysis of variance is shown in Table 41. 

The matter will be much clearer if we work through a concrete 
example. Let us consider Table 42, which shows the first flowering 
date (day of the year) of 5 varieties of plants at 6 different stations 
over a period of 3 years. In the preceding terminology, the plants 
correspond to columns, the stations to groups, and the years to 
rows. In this table totals rather than means are shown. 

The total sum of squares of deviations is found by summing the 
squares of all individual items and then subtracting the square of 



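TABLE 41

Analysis of Variance for Triple Classification ($m$ Columns, $l$ Groups, $k$ Rows)

                             Sum of squares of deviations                                                                   Degrees of freedom
Rows                         $lm\sum_h(\bar{X}_{h..} - \bar{X})^2$                                                            $k - 1$
Groups                       $km\sum_i(\bar{X}_{.i.} - \bar{X})^2$                                                            $l - 1$
Columns                      $kl\sum_j(\bar{X}_{..j} - \bar{X})^2$                                                            $m - 1$
Rows X groups                $m\sum_h\sum_i(\bar{X}_{hi.} - \bar{X}_{h..} - \bar{X}_{.i.} + \bar{X})^2$                       $(k - 1)(l - 1)$
Rows X columns               $l\sum_h\sum_j(\bar{X}_{h.j} - \bar{X}_{h..} - \bar{X}_{..j} + \bar{X})^2$                       $(k - 1)(m - 1)$
Groups X columns             $k\sum_i\sum_j(\bar{X}_{.ij} - \bar{X}_{.i.} - \bar{X}_{..j} + \bar{X})^2$                       $(l - 1)(m - 1)$
Rows X groups X columns      $\sum_h\sum_i\sum_j(X_{hij} - \bar{X}_{hi.} - \bar{X}_{h.j} - \bar{X}_{.ij} + \bar{X}_{h..} + \bar{X}_{.i.} + \bar{X}_{..j} - \bar{X})^2$    $(k - 1)(l - 1)(m - 1)$
Total                        $\sum_h\sum_i\sum_j(X_{hij} - \bar{X})^2$                                                        $klm - 1$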













TABLE 42

First Flowering Date (Day of Year) of 5 Varieties of Plants
at 6 Stations Over a Period of 3 Years

                  Hazel   Coltsfoot   Anemone   Blackthorn   Mustard   Total
Broadchalke
  1932              57        67         95        102         123       444
  1933              46        72         90         88         101       397
  1934              28        66         89        109         113       405
  Total            131       205        274        299         337      1246
Bratton
  1932              26        44         92         96          93       351
  1933              38        68         89         89         110       394
  1934              20        64         95        106         115       400
  Total             84       176        276        291         318      1145
Lenham
  1932              48        61         78         99         113       399
  1933              35        60         89         87         109       380
  1934              48        75         95        113         111       442
  Total            131       196        262        299         333      1221
Dorstone
  1932              50        68         85        117         124       444
  1933              37        65         74         93         102       371
  1934              19        61         80        107         118       385
  Total            106       194        239        317         344      1200
Coaley
  1932              23        74        105        103         120       425
  1933              36        47         85         90         101       359
  1934              18        69         85        105         111       388
  Total             77       190        275        298         332      1172
Ipswich
  1932              39        57         91        102         112       401
  1933              39        61         82         93         104       379
  1934              43        61         98         98         112       412
  Total            121       179        271        293         328      1192
All stations
  1932             243       371        546        619         685      2464
  1933             231       373        509        540         627      2280
  1934             176       396        542        638         680      2432
  Total            650      1140       1597       1797        1992      7176







































the grand total divided by the number of items (5 X 6 X 3 = 90).
This can readily be done on a calculating machine. We find

$$(57)^2 + (67)^2 + \cdots + (112)^2 - \frac{(7176)^2}{90} = 644{,}074 - \frac{51{,}494{,}976}{90} = 644{,}074 - 572{,}166.4 = 71{,}907.6$$


For plants and stations we have Table 42A. From this table 


TABLE 42A

Plants and Stations

               Hazel   Coltsfoot   Anemone   Blackthorn   Mustard   Total
Broadchalke     131       205        274        299         337      1246
Bratton          84       176        276        291         318      1145
Lenham          131       196        262        299         333      1221
Dorstone        106       194        239        317         344      1200
Coaley           77       190        275        298         332      1172
Ipswich         121       179        271        293         328      1192
Total           650      1140       1597       1797        1992      7176


we calculate, as we did for the double classification table of
section 64,

$$\tfrac{1}{3}[(131)^2 + (205)^2 + \cdots + (328)^2] - \tfrac{1}{90}(7176)^2 = \tfrac{1}{3} \times 1{,}916{,}812 - 572{,}166.4 = 66{,}770.93$$

The reason that we take one-third of the sum of squares is that
each value such as 131 is the total of three items (flowering dates
for 1932, 1933, 1934). The value 66,770.93 is the subtotal sum of
squares of deviations for plants and stations.

The sum of squares of deviations for plants is

$$\tfrac{1}{18}[(650)^2 + (1140)^2 + \cdots + (1992)^2] - \tfrac{1}{90}(7176)^2 = \tfrac{1}{18} \times 11{,}469{,}782 - 572{,}166.4 = 65{,}043.71$$

Here each value such as 650 is the total of 18 items (flowering 












dates for 3 years at 6 stations), and we must divide the sum of
squares by 18.

The sum of squares of deviations for stations is

$$\tfrac{1}{15}[(1246)^2 + (1145)^2 + \cdots + (1192)^2] - \tfrac{1}{90}(7176)^2 = \tfrac{1}{15} \times 8{,}588{,}830 - 572{,}166.4 = 422.26$$

The interaction term for plants and stations is the remainder,

$$66{,}770.93 - (65{,}043.71 + 422.26) = 1304.95$$

For plants and years we have Table 42B.


TABLE 42B

Plants and Years

          Hazel   Coltsfoot   Anemone   Blackthorn   Mustard   Total
1932       243       371        546        619         685      2464
1933       231       373        509        540         627      2280
1934       176       396        542        638         680      2432
Total      650      1140       1597       1797        1992      7176


The subtotal sum of squares of deviations is

$$\tfrac{1}{6}[(243)^2 + (371)^2 + \cdots + (680)^2] - \tfrac{1}{90}(7176)^2 = \tfrac{1}{6} \times 3{,}834{,}492 - 572{,}166.4 = 66{,}915.6$$

The sum of squares of deviations for years is

$$\tfrac{1}{30}[(2464)^2 + (2280)^2 + (2432)^2] - \tfrac{1}{90}(7176)^2 = \tfrac{1}{30} \times 17{,}184{,}320 - 572{,}166.4 = 644.26$$

The sum of squares of deviations for plants has already been
found to be 65,043.71. Consequently the interaction term for
plants and years is

$$66{,}915.6 - (644.26 + 65{,}043.71) = 1227.62$$
























The table for stations and years is Table 42C. 

TABLE 42C

Stations and Years

          Broadchalke   Bratton   Lenham   Dorstone   Coaley   Ipswich   Total
1932          444         351       399       444       425      401      2464
1933          397         394       380       371       359      379      2280
1934          405         400       442       385       388      412      2432
Total        1246        1145      1221      1200      1172     1192      7176


Since we have the sums of squares of deviations for stations and
for years (422.26 and 644.26 respectively) we need only to find
the subtotal sum of squares of deviations. This is

$$\tfrac{1}{5}[(444)^2 + (351)^2 + \cdots + (412)^2] - \tfrac{1}{90}(7176)^2 = \tfrac{1}{5} \times 2{,}873{,}410 - 572{,}166.4 = 2515.6$$

The interaction between stations and years is

$$2515.6 - (422.26 + 644.26) = 1449.06$$

We can now find the sum of squares of deviations for the
interaction among plants, stations, and years by subtracting from
the total the terms for plants, stations, and years, and the single
interaction terms, plants X stations, plants X years, and stations
X years. This gives

$$71{,}907.6 - (65{,}043.71 + 422.26 + 644.26 + 1304.95 + 1227.62 + 1449.06) = 1815.91$$

We are now ready to form the analysis of variance table 
(Table 42D). 













TABLE 42D

Analysis of Variance

                               Sum of squares     Degrees of               Mean square
                               of deviations      freedom                  deviation
Plants                          65,043.71         5 - 1 = 4                16,260.927
Stations                           422.26         6 - 1 = 5                    84.453
Years                              644.26         3 - 1 = 2                   322.133
Plants X stations                1,304.95         4 X 5 = 20                   65.247
Plants X years                   1,227.62         4 X 2 = 8                   153.452
Stations X years                 1,449.06         5 X 2 = 10                  144.906
Plants X stations X years        1,815.91         4 X 5 X 2 = 40               45.397
Total                           71,907.6          5 X 6 X 3 - 1 = 89



If we wish to test any variance, that is, any mean square
deviation, we use the interaction of plants, stations, and years for the
comparison term. For example, if we wish to test the variation
from year to year we take

$$z = \tfrac{1}{2}\log_e\frac{322.13}{45.40} = 0.9797, \qquad n_1 = 2, \quad n_2 = 40$$

This value is outside the 5 per cent point and is very close to the 
1 per cent point. 
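The value of z quoted above is easily recomputed. The snippet below is a minimal sketch, assuming only the two mean squares of Table 42D; the function name `fisher_z` is ours and is not taken from any library.

```python
import math


def fisher_z(mean_square, error_mean_square):
    """Fisher's z: half the natural logarithm of the ratio of two mean squares."""
    return 0.5 * math.log(mean_square / error_mean_square)


# Mean squares taken from Table 42D.
ms_years = 322.13    # years, 2 degrees of freedom
ms_triple = 45.397   # plants x stations x years, 40 degrees of freedom

variance_ratio = ms_years / ms_triple
z = fisher_z(ms_years, ms_triple)
print(f"variance ratio = {variance_ratio:.3f}, z = {z:.4f}")
```

The variance ratio itself is what Snedecor's tables are entered with; z is simply half its natural logarithm.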

In double or multiple classification the case of unequal frequencies
in the various classes presents some difficulties. Those interested
in this case are referred to the original papers in which it has
been treated.*

* See F. Yates, "The analysis of multiple classifications with unequal
numbers in the different classes," Journal of the American Statistical
Association, vol. 29, 1934, pp. 51-66; and the references contained therein.

65. Analysis of covariance.† The covariance between N
pairs of values of X and Y has been defined (see section 18) as
$\Sigma xy/N$, where $x = X - \bar{X}$, $y = Y - \bar{Y}$. It is quite possible to
analyze the sum of products into components just as we analyzed
the sum of squares. In this way we are able to obtain estimates of
the covariance, and also of the regression and correlation coefficients,
which are freed from class or other effects. The analysis of
covariance has been used in agricultural and biological problems by
Fisher and others. Bailey * has given a number of miscellaneous
examples of its use and has emphasized its applicability to time
series in economics.

† For a fuller treatment of this topic consult the following references:

R. A. Fisher, "Statistical Methods for Research Workers," section 49.1.

J. Wishart and H. G. Sanders, "Principles and Practice of Field
Experimentation," Empire Cotton Growing Corporation, London, 1936.

J. Wishart, "Tests of significance in analysis of covariance," Supplement
to the Journal of the Royal Statistical Society, vol. 3, 1936, pp. 79-82, and
further references contained therein.

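* A. L. Bailey, "The analysis of covariance," Journal of the American
Statistical Association, vol. 26, 1931, pp. 424-436.

Suppose that we have a table (Table 43) similar to Table 31
but with two variables instead of one. That is, we have k classes
with m individuals in each. With the ith individual in the jth
class is associated a pair of values $(X_{ij}, Y_{ij})$, which may be regarded
as the measures of a certain pair of attributes or characteristics.

TABLE 43

Class          1                  ...           k

        $(X_{11}, Y_{11})$        ...    $(X_{1k}, Y_{1k})$
               ...                ...          ...
        $(X_{m1}, Y_{m1})$        ...    $(X_{mk}, Y_{mk})$

Mean    $(\bar{X}_1, \bar{Y}_1)$  ...    $(\bar{X}_k, \bar{Y}_k)$

If $\bar{X}_j$, $\bar{Y}_j$ are the means of X and Y respectively in the jth class,
then the sum of products $\Sigma xy$ can be written

$$
\begin{aligned}
\sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij}-\bar{X})(Y_{ij}-\bar{Y})
&= \sum_{j=1}^{k}\sum_{i=1}^{m}\left[(X_{ij}-\bar{X}_j)+(\bar{X}_j-\bar{X})\right]\left[(Y_{ij}-\bar{Y}_j)+(\bar{Y}_j-\bar{Y})\right] \\
&= \sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij}-\bar{X}_j)(Y_{ij}-\bar{Y}_j)
 + \sum_{j=1}^{k}\sum_{i=1}^{m}(\bar{X}_j-\bar{X})(Y_{ij}-\bar{Y}_j) \\
&\quad + \sum_{j=1}^{k}\sum_{i=1}^{m}(\bar{Y}_j-\bar{Y})(X_{ij}-\bar{X}_j)
 + \sum_{j=1}^{k}\sum_{i=1}^{m}(\bar{X}_j-\bar{X})(\bar{Y}_j-\bar{Y}) \\
&= \sum_{j=1}^{k}\sum_{i=1}^{m}(X_{ij}-\bar{X}_j)(Y_{ij}-\bar{Y}_j)
 + m\sum_{j=1}^{k}(\bar{X}_j-\bar{X})(\bar{Y}_j-\bar{Y}) \qquad (43)
\end{aligned}
$$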

The term $\sum_j\sum_i(\bar{X}_j - \bar{X})(Y_{ij} - \bar{Y}_j)$ vanishes, since it is equal to

$$\sum_{j=1}^{k}(\bar{X}_j-\bar{X})\sum_{i=1}^{m}(Y_{ij}-\bar{Y}_j) = \sum_{j=1}^{k}(\bar{X}_j-\bar{X})(m\bar{Y}_j - m\bar{Y}_j) = 0$$

Similarly the corresponding term with X and Y interchanged
vanishes. The fundamental identity (43) expresses the fact that
the total sum of products of deviations is equal to the sum of
products of deviations within classes plus the sum of products of
the deviations of the class means from the general means, multiplied
by the number of individuals in each class.

The sum of products can be broken up into more components
just as can the sum of squares. It is not necessary to illustrate
further subdivision, however, as it can be effected in a manner
entirely analogous to the subdivision in the analysis of variance.

As an example of the analysis of covariance, consider the 
following data on the yield of wheat in bushels per acre and the 
production cost per bushel for five farms in each of three districts. 


TABLE 44

       District I                District II               District III
Yield (bushels  Cost per   Yield (bushels  Cost per   Yield (bushels  Cost per
  per acre)      bushel      per acre)      bushel      per acre)      bushel
      9          $1.80          18          $1.00          14          $1.00
     11           1.40          18           0.50          10           1.50
      8           2.00          20           0.70          13           0.80
      9           1.50          16           0.70          15           0.70
     11           1.70           9           2.00           7           2.10



















Designating yield by X and cost by Y, we form Table 45, in
which are also listed X², XY, Y².

TABLE 45

                  X        Y       X²       XY       Y²
District I        9      1.80      81     16.2     3.24
                 11      1.40     121     15.4     1.96
                  8      2.00      64     16.0     4.00
                  9      1.50      81     13.5     2.25
                 11      1.70     121     18.7     2.89
Subtotal         48      8.40     468     79.8    14.34

District II      18      1.00     324     18.0     1.00
                 18      0.50     324      9.0     0.25
                 20      0.70     400     14.0     0.49
                 16      0.70     256     11.2     0.49
                  9      2.00      81     18.0     4.00
Subtotal         81      4.90    1385     70.2     6.23

District III     14      1.00     196     14.0     1.00
                 10      1.50     100     15.0     2.25
                 13      0.80     169     10.4     0.64
                 15      0.70     225     10.5     0.49
                  7      2.10      49     14.7     4.41
Subtotal         59      6.10     739     64.6     8.79

Total           188     19.40    2592    214.6    29.36


TABLE 46
Among Districts

District       X        Y        X²        XY       Y²
I             48      8.40     2,304     403.2    70.56
II            81      4.90     6,561     396.9    24.01
III           59      6.10     3,481     359.9    37.21
Total        188     19.40    12,346    1160.0   131.78






























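In computing, from Tables 45 and 46, the sums of products of
deviations and the sums of squares of deviations, we make use of
the formulas

$$\Sigma x^2 = \Sigma X^2 - \frac{(\Sigma X)^2}{N}, \qquad \Sigma xy = \Sigma XY - \frac{\Sigma X \cdot \Sigma Y}{N}, \qquad \Sigma y^2 = \Sigma Y^2 - \frac{(\Sigma Y)^2}{N}$$

For the total we have

$$
\begin{aligned}
\Sigma x^2 &= 2592 - (188)^2/15 = 2592 - 2356.2\dot{6} = 235.7\dot{3} \\
\Sigma xy &= 214.6 - 188 \times 19.40/15 = 214.6 - 243.14\dot{6} = -28.54\dot{6} \\
\Sigma y^2 &= 29.36 - (19.4)^2/15 = 29.36 - 25.090\dot{6} = 4.269\dot{3}
\end{aligned}
$$

Among districts,

$$
\begin{aligned}
\Sigma x^2 &= \tfrac{1}{5} \times 12346 - \tfrac{1}{15}(188)^2 = 2469.2 - 2356.2\dot{6} = 112.9\dot{3} \\
\Sigma xy &= \tfrac{1}{5} \times 1160.0 - \tfrac{1}{15} \times 188 \times 19.4 = 232 - 243.14\dot{6} = -11.14\dot{6} \\
\Sigma y^2 &= \tfrac{1}{5} \times 131.78 - \tfrac{1}{15}(19.4)^2 = 26.356 - 25.090\dot{6} = 1.265\dot{3}
\end{aligned}
$$

Within districts,

$$
\begin{aligned}
\text{I.}\quad \Sigma x^2 &= 468 - (48)^2/5 = 468 - 460.8 = 7.2 \\
\Sigma xy &= 79.8 - 48 \times 8.4/5 = 79.8 - 80.64 = -0.84 \\
\Sigma y^2 &= 14.34 - (8.4)^2/5 = 14.34 - 14.112 = 0.228 \\[4pt]
\text{II.}\quad \Sigma x^2 &= 1385 - (81)^2/5 = 1385 - 1312.2 = 72.8 \\
\Sigma xy &= 70.2 - 81 \times 4.9/5 = 70.2 - 79.38 = -9.18 \\
\Sigma y^2 &= 6.23 - (4.9)^2/5 = 6.23 - 4.802 = 1.428 \\[4pt]
\text{III.}\quad \Sigma x^2 &= 739 - (59)^2/5 = 739 - 696.2 = 42.8 \\
\Sigma xy &= 64.6 - 59 \times 6.1/5 = 64.6 - 71.98 = -7.38 \\
\Sigma y^2 &= 8.79 - (6.1)^2/5 = 8.79 - 7.442 = 1.348
\end{aligned}
$$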
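The arithmetic of this step can be retraced from the raw pairs. The following sketch is ours, not the author's; it assumes the yields and costs as read from Tables 44 and 45 (in particular the fourth District I cost is taken as 1.50, which agrees with the subtotal 8.40), and the name `corrected_sums` is hypothetical.

```python
# (yield X, cost Y) for the five farms in each district, as read from Table 45.
DISTRICTS = {
    "I": ([9, 11, 8, 9, 11], [1.80, 1.40, 2.00, 1.50, 1.70]),
    "II": ([18, 18, 20, 16, 9], [1.00, 0.50, 0.70, 0.70, 2.00]),
    "III": ([14, 10, 13, 15, 7], [1.00, 1.50, 0.80, 0.70, 2.10]),
}


def corrected_sums(xs, ys):
    """Return (sum x^2, sum xy, sum y^2) of deviations from the means."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs) - sx * sx / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sx * sy / n
    syy = sum(y * y for y in ys) - sy * sy / n
    return sxx, sxy, syy


per_district = {name: corrected_sums(xs, ys) for name, (xs, ys) in DISTRICTS.items()}
within = tuple(sum(part) for part in zip(*per_district.values()))

all_x = [x for xs, _ in DISTRICTS.values() for x in xs]
all_y = [y for _, ys in DISTRICTS.values() for y in ys]
total = corrected_sums(all_x, all_y)

# By identity (43) the among-districts terms are total minus within.
among = tuple(t - w for t, w in zip(total, within))

for label, triple in (("within", within), ("among", among), ("total", total)):
    print(label, [round(v, 4) for v in triple])
```

Here the among-districts line is obtained by subtraction rather than from Table 46; the two routes must agree, which gives a check on the table.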


Compiling these results, we have Table 47. From this table we
can calculate the regression coefficients for the different components.
Within districts we find

$$b_1 = \frac{-17.40}{122.8} = -0.1417$$

while among districts

$$b_2 = \frac{-11.14\dot{6}}{112.9\dot{3}} = -0.0987$$

TABLE 47
Analysis of Variance and Covariance

                       Σx²        Σxy        Σy²     Degrees of
                                                     freedom
District I             7.2      -0.84      0.228        4
         II           72.8      -9.18      1.428        4
         III          42.8      -7.38      1.348        4
Within districts     122.8     -17.40      3.004       12
Among districts      112.93    -11.146     1.2653       2
Total                235.73    -28.546     4.2693      14


Let us first test whether the coefficient of regression within districts
is significant. We may use the t test of section 43, or may
use the analysis of variance method. Let us use the latter. For
the sum of squares of deviations from the regression line, $y' = b_1 x$,
we have

$$\Sigma(y - b_1 x)^2 = \Sigma y^2 - 2b_1\Sigma xy + b_1^2\Sigma x^2$$

which, since $b_1 = \Sigma xy/\Sigma x^2$, reduces to

$$\Sigma y^2 - \frac{(\Sigma xy)^2}{\Sigma x^2} = 3.004 - \frac{(-17.40)^2}{122.8} = 3.004 - 2.46547 = 0.53853$$

The analysis is summarized in Table 48.















TABLE 48
Analysis of Variance Within Districts

               Sum of squares   Degrees of   Mean square
               of deviations    freedom      deviation
Residuals         0.53853          11          0.0490
Regression        2.46547           1          2.4655
Total             3.004            12

We find $z = \tfrac{1}{2}\log_e(2.4655/0.0490) = 1.959$, which for $n_1 = 1$
and $n_2 = 11$ is highly significant.
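The entries of Table 48 and the resulting z follow directly from the within-districts line of Table 47. A brief sketch, with variable names of our own choosing:

```python
import math

# Within-districts sums of squares and products from Table 47.
sxx, sxy, syy = 122.8, -17.40, 3.004
df_within = 12

b1 = sxy / sxx                       # regression coefficient within districts
regression_ss = sxy ** 2 / sxx       # 1 degree of freedom
residual_ss = syy - regression_ss    # df_within - 1 degrees of freedom

ms_regression = regression_ss / 1
ms_residual = residual_ss / (df_within - 1)
z = 0.5 * math.log(ms_regression / ms_residual)

print(f"b1 = {b1:.4f}, regression = {regression_ss:.5f}, residual = {residual_ss:.5f}")
print(f"mean squares: {ms_regression:.4f}, {ms_residual:.4f}; z = {z:.3f}")
```

Carrying unrounded mean squares through the logarithm moves z only in the fourth decimal place relative to the value found from the rounded entries of Table 48.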

Next we analyze the residual variance. For the regression
among districts we have, as above,

$$\Sigma(y - b_2 x)^2 = 1.265\dot{3} - \frac{(-11.14\dot{6})^2}{112.9\dot{3}} = 1.265\dot{3} - 1.10019 = 0.16514$$

The sum of squares of differences between regressions is *

$$\frac{(-17.40)^2}{122.8} + \frac{(-11.14\dot{6})^2}{112.9\dot{3}} - \frac{(-28.54\dot{6})^2}{235.7\dot{3}} = 2.46547 + 1.10019 - 3.45692 = 0.10874$$

The sum of squares of the deviations among districts is

$$0.16514 + 0.10874 = 0.27388$$

The sum of squares of deviations for the "total" is

$$4.269\dot{3} - \frac{(-28.54\dot{6})^2}{235.7\dot{3}} = 4.269\dot{3} - 3.45692 = 0.81241$$

and the corresponding sum within districts has already been found
to be 0.53853.

These results are recapitulated in Table 49. Any desired tests
can be made by computing the appropriate values of z, using the
mean square deviation within districts as the error mean square.

* See Wishart and Sanders, op. cit.

TABLE 49
Analysis of Residual Variance

                               Sum of squares   Degrees of   Mean square
                               of deviations    freedom      deviation
Between regressions               0.10874           1          0.10874
Residuals from the
  "among" regression              0.16514           1          0.16514
Among districts                   0.27388           2          0.13694
Within districts                  0.53853          11          0.04896
Total                             0.81241          13

To be noted is the fact that the estimated standard deviation
has been decreased from $(3.004/12)^{1/2} = 0.500$ to $(0.04896)^{1/2} = 0.221$;
that is, it has been cut to less than half, thus greatly increasing
the accuracy of any tests made.
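The whole of Table 49, and the two standard deviations just compared, can be reproduced in a few lines. This is a sketch under the assumption that the sums of Tables 45 and 47 are as read above; the helper `regression_part` is a name of ours.

```python
import math

N, DISTRICTS = 15, 3

# Total line rebuilt from the raw sums of Table 45; within line from Table 47.
total = (2592 - 188 ** 2 / N, 214.6 - 188 * 19.4 / N, 29.36 - 19.4 ** 2 / N)
within = (122.8, -17.40, 3.004)
among = tuple(t - w for t, w in zip(total, within))


def regression_part(sxx, sxy, syy):
    """Portion of sum y^2 accounted for by the regression of y on x."""
    return sxy ** 2 / sxx


between_regressions = (
    regression_part(*within) + regression_part(*among) - regression_part(*total)
)
among_residual = among[2] - regression_part(*among)
among_adjusted = among_residual + between_regressions
within_residual = within[2] - regression_part(*within)
total_residual = total[2] - regression_part(*total)

sd_before = math.sqrt(within[2] / (N - DISTRICTS))            # 12 degrees of freedom
sd_after = math.sqrt(within_residual / (N - DISTRICTS - 1))   # 11 degrees of freedom

print(round(between_regressions, 5), round(among_residual, 5), round(among_adjusted, 5))
print(round(within_residual, 5), round(total_residual, 5))
print(round(sd_before, 3), round(sd_after, 3))
```

Note that the adjusted among-districts term and the within-districts residual add up to the residual for the total, which is the check implicit in Table 49.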


EXERCISES 

1. (a) From the data of Table O, page 63, can it be concluded that either
sex has a greater variability in number of red blood cells than the other?
(b) Can it be concluded that either sex has a greater variability in amount of
hemoglobin?

2. Analyze the variance for the linear regression found in exercise 2,
page 43, and make a test of significance.

3. Make an analysis of variance for the parabolic regression obtained in
exercise 5, page 45, and make tests of significance.

4. Analyze the variance for the multiple regression obtained in exercise 8,
page 45, and make a test of significance.

5. (a) Analyze the variance for the correlation ratio found in exercise 3,
page 62. (b) Test the significance of this correlation ratio. (c) Test the
linearity of the regression.

6. Make an analysis of variance for the correlation ratio found in exercise 4,
page 64. Test (a) the significance of the correlation ratio, and (b) the linearity
of the regression.

7. Analyze the variance in Table W into that within batches and that
among batches, and test for significance.













TABLE W
Breaking Strength (Pounds Tension) of 10 Batches of Cement Briquettes

                              Batch
   1     2     3     4     5     6     7     8     9    10
  618   608   664   656   636   644   678   630   690   642
  660   674   698   667   492   502   532   664   654   666
  638   628   679         628   648   662   636   630   690
  610   634   638   536   672   662   624   640   672   646
  644   638   644         506   634   648         625   622


8. Table Y gives porosity readings on 3 lots of condenser paper. There
are 3 readings on each of 9 rolls from each lot. Determine whether there are
significant variations (a) among readings within rolls, (b) among rolls within
lots, and (c) among lots. (Western Electric Co. data.)

TABLE Y
Porosity Readings on Condenser Paper

Lot      Reading                     Roll number
number   number     1    2    3    4    5    6    7    8    9
I          1       1.5  1.6  2.7  3.0  3.4  2.1  2.0  3.0  6.1
           2       1.7  1.6  1.9  2.4  6.6  4.1  2.5  2.0  6.0
           3       1.6  1.7  2.0  2.6  6.6  4.6  2.8  1.9  4.0
II         1       1.9  2.3  1.8  1.9  2.0  3.0  2.4  1.7  2.6
           2       1.5  2.4  2.9  3.5  1.9  2.6  2.0  1.6  4.3
           3       2.1  2.4  4.7  2.8  2.1  3.6  2.1  2.0  2.4
III        1       2.5  3.2  1.4  7.8  3.2  1.9  2.0  1.1  2.1
           2       2.9  6.6  1.6  6.2  2.6  2.2  2.4  1.4  2.6
           3       3.3  7.1  3.4  6.0  4.0  3.1  3.7  4.1  1.9



















9. Table Z gives impact strength readings, in foot-pounds, on 5 lots of
insulating material. One specimen from each of 20 different sheets was tested
from each lot. The first 10 specimens were cut along the lengthwise direction
of the sheets, the second 10 specimens were cut along the crosswise direction.
Determine whether there are significant variations (a) among lots, and (b)
between lengthwise and crosswise specimens. (Western Electric Co. data.)

TABLE Z
Impact Strength Readings (Foot-Pounds) on 5 Lots of Insulating Material

              Specimen             Lot number
              number       I      II     III     IV      V
                 1       1.16   1.16   0.79   0.96   0.49
                 2       0.84   0.85   0.68   0.82   0.61
                 3       0.88   1.00   0.64   0.98   0.69
                 4       0.91   1.08   0.72   0.93   0.51
Lengthwise       5       0.86   0.80   0.63   0.81   0.63
specimens        6       0.88   1.01   0.59   0.79   0.72
                 7       0.92   1.14   0.81   0.79   0.67
                 8       0.87   0.87   0.66   0.86   0.47
                 9       0.93   0.97   0.64   0.84   0.44
                10       0.95   1.09   0.75   0.92   0.48

                11       0.89   0.86   0.52   0.86   0.52
                12       0.69   1.17   0.52   1.06   0.53
                13       0.46   1.18   0.80   0.81   0.47
                14       0.85   1.32   0.64   0.97   0.47
Crosswise       15       0.73   1.03   0.63   0.90   0.67
specimens       16       0.67   0.84   0.68   0.93   0.54
                17       0.78   0.89   0.65   0.87   0.56
                18       0.77   0.84   0.60   0.88   0.65
                19       0.80   1.03   0.71   0.89   0.45
                20       0.79   1.06   0.69   0.82   0.60








10. The following data are the thicknesses of coating, in 0.0001 in., on
fiber strips sprayed with varnish. Measurements were taken at 5 different
points on each of the 3 strips selected from each of 5 lots. Test for significance
the variance (a) among points within strips, (b) among strips within lots, and
(c) among lots. (Western Electric Co. data.)

         Strip            Point number
Lot      number      1     2     3     4     5
I          1        10     8    10     9     7
           2         8     8     8     8    10
           3         8    10    10     6     7
II         1        13    12    12    12    13
           2        10     9    13    11     8
           3        11     8    10    12    12
III        1        12    13    14    17    16
           2        17    10    13    10    14
           3        12    11    13    16    13
IV         1        14    13    17    11    11
           2        11     9    13    11    12
           3        17    13    14    13     8
V          1         9    13    17    13    11
           2         8    11    10    12    11
           3         7    14    14     9     9












11. Table AA gives the initial weights and the average daily gains of 4 lots
of 10 pigs each. Each lot was fed on a different diet. Make an analysis of
variance and covariance, and test the significance of the differences among
the adjusted or residual mean gains of the different lots.

TABLE AA
Initial Weights (X Pounds) and Average Daily Gains
(Y Pounds per Day) of 4 Lots of Pigs

      Lot 1              Lot 2              Lot 3              Lot 4
Initial   Daily    Initial   Daily    Initial   Daily    Initial   Daily
weight    gain     weight    gain     weight    gain     weight    gain
   X        Y         X        Y         X        Y         X        Y
  36      1.33       38      1.25       45      1.22       38      1.35
  65      1.13       60      1.39       69      1.79       73      1.60
  44      1.80       41      1.67       38      1.31       40      1.26
  61      1.48       60      1.29       53      1.50       43      1.15
  66      1.76       61      1.25       50      1.31       44      1.64
  44      1.42       44      1.20       46      1.65       48      1.24
  67      1.90       60      1.40       66      1.70       60      1.47
  79      1.67       71      1.29       61      1.40       62      1.29
  41      1.31       63      1.30       39      1.36       61      1.41
  67      1.34       64      1.46       59      1.63       58      1.36


CHAPTER IX 
EXPERIMENTAL DESIGN 

66. Randomized blocks. The realization is becoming stronger
that, in order to yield the best results and the greatest possible
information, an experiment should be properly planned before it
is performed. The development of the analysis of variance and
the improvement of experimental design have proceeded simultaneously.
In agricultural science particularly, it has been found
desirable to design experiments so that the analysis of variance
can be conveniently and correctly applied.

One field arrangement that has been found extremely useful,
and at the same time specially suited to the application of the
analysis of variance, is that of randomized blocks. Consider, for
simplicity, an experiment which is to be made on three varieties of
wheat to ascertain which has the greatest yield in bushels per acre.
We might take four blocks of land, divide each block into three
strips, and sow each strip of a given block with a different variety,
arranging them at random. Such an experiment would be represented
by the following diagram, in which $v_1$, $v_2$, $v_3$ are the three
varieties, and the numbers shown in the various strips are the
yields in bushels per acre.


[Diagram: four blocks, numbered 1 to 4, each divided into three strips.]


When the data are arranged in tabular form we have Table 50.

TABLE 50
Yields in Bushels per Acre of 3 Varieties of Wheat

                      Block
Variety      1     2     3     4     Total
   1         8    10    16    12       46
   2        11    18     9    14       52
   3         9    18    20     7       54
Total       28    46    45    33      152


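They can be analyzed as before by formulas (40), (41), and (42)
of section 64.

The total sum of squares of deviations is

$$
\begin{aligned}
\sum_i\sum_j (X_{ij}-\bar{X})^2 &= (8)^2 + (10)^2 + (16)^2 + (12)^2 + (11)^2 + (18)^2 + (9)^2 \\
&\quad + (14)^2 + (9)^2 + (18)^2 + (20)^2 + (7)^2 - \tfrac{1}{12}(152)^2 \\
&= 2140 - 1925.\dot{3} = 214.\dot{6}
\end{aligned}
$$

The sum of squares of block differences (due to differences in
soil fertility, etc.) is

$$
\begin{aligned}
m\sum_j (\bar{X}_{\cdot j}-\bar{X})^2 &= \frac{1}{m}\sum_j\Big(\sum_i X_{ij}\Big)^2 - \frac{1}{N}\Big(\sum_i\sum_j X_{ij}\Big)^2 \\
&= \tfrac{1}{3}\left[(28)^2 + (46)^2 + (45)^2 + (33)^2\right] - \tfrac{1}{12}(152)^2 \\
&= 2004.\dot{6} - 1925.\dot{3} = 79.\dot{3}
\end{aligned}
$$

For variety differences we find

$$
\begin{aligned}
k\sum_i (\bar{X}_{i\cdot}-\bar{X})^2 &= \frac{1}{k}\sum_i\Big(\sum_j X_{ij}\Big)^2 - \frac{1}{N}\Big(\sum_i\sum_j X_{ij}\Big)^2 \\
&= \tfrac{1}{4}\left[(46)^2 + (52)^2 + (54)^2\right] - \tfrac{1}{12}(152)^2 \\
&= 1934 - 1925.\dot{3} = 8.\dot{6}
\end{aligned}
$$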


















The error term is

$$214.\dot{6} - 79.\dot{3} - 8.\dot{6} = 126.\dot{6}$$

so that the analysis of variance table is as follows:

TABLE 51

             Sum of squares   Degrees of     Mean square
             of deviations    freedom        deviation
Blocks           79.33        4 - 1 = 3        26.44
Varieties         8.67        3 - 1 = 2         4.33
Error           126.67        3 × 2 = 6        21.11
Total           214.67        12 - 1 = 11


The mean square deviation for varieties is less than that for 
error. 
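The figures of Table 51 may be recomputed from the yields of Table 50. The sketch below is ours and merely retraces formulas of the kind used above; the function name `randomized_block_anova` is hypothetical, and the yield of variety 1 in block 2 is taken as 10, as in the sum of squares for the total.

```python
# Yields from Table 50: one row per variety, one column per block.
YIELDS = [
    [8, 10, 16, 12],
    [11, 18, 9, 14],
    [9, 18, 20, 7],
]


def randomized_block_anova(table):
    """Return {source: (sum of squares, degrees of freedom)}."""
    m = len(table)        # number of varieties
    k = len(table[0])     # number of blocks
    n = m * k
    correction = sum(map(sum, table)) ** 2 / n
    total = sum(v * v for row in table for v in row) - correction
    blocks = sum(sum(col) ** 2 for col in zip(*table)) / m - correction
    varieties = sum(sum(row) ** 2 for row in table) / k - correction
    error = total - blocks - varieties
    return {
        "blocks": (blocks, k - 1),
        "varieties": (varieties, m - 1),
        "error": (error, (k - 1) * (m - 1)),
        "total": (total, n - 1),
    }


anova = randomized_block_anova(YIELDS)
for source, (ss, df) in anova.items():
    mean_square = "" if source == "total" else f"{ss / df:8.2f}"
    print(f"{source:10s}{ss:9.2f}{df:4d} {mean_square}")
```

Since the variety mean square comes out smaller than the error mean square, no test of significance is needed here.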

A similar analysis could be made if different treatments were 
applied to the same variety. 

If each variety in the foregoing illustration were treated with
two different kinds of fertilizer we should have six different
combinations

$$v_1t_1,\quad v_1t_2,\quad v_2t_1,\quad v_2t_2,\quad v_3t_1,\quad v_3t_2$$

where of course v refers to variety and t to treatment. Each block
would be divided into six different plots and a typical block would
be that shown in the diagram. That is, each block would be divided
into three strips, to which would be assigned at random the
three varieties. Each strip would be subdivided into two plots, and
the treatments would be allocated to them at random. The results
could be tabulated as in Table 52.

[Diagram: a typical block of three strips, sown with varieties v2, v1, v3,
each strip subdivided into two plots for treatments t1 and t2.]

These data could be analyzed in precisely the same way as were
the phenological data (first flowering dates of plants).

TABLE 52
Yields in Bushels per Acre of Three Varieties of Wheat
Treated with Two Different Fertilizers

                                      Block
                               1     2     3     4     Total
Variety 1      Treatment 1     8    10    16    12       46
               Treatment 2     7     8    12    13       40
               Total          15    18    28    25       86

Variety 2      Treatment 1    11    18     9    14       52
               Treatment 2    10    16    10    12       48
               Total          21    34    19    26      100

Variety 3      Treatment 1     9    18    20     7       54
               Treatment 2    10    20    20    10       60
               Total          19    38    40    17      114

All varieties  Treatment 1    28    46    45    33      152
               Treatment 2    27    44    42    35      148
               Total          55    90    87    68      300



It should be noted that when we have only two subdivisions, as
for treatments in the foregoing illustration, the sum of squares of
deviations can sometimes be calculated more simply as follows:
The totals 152 and 148 represent 12 plots each, and according to
the method already employed the sum of squares would be

$$\tfrac{1}{12}\left[(152)^2 + (148)^2\right] - \tfrac{1}{24}(300)^2 = 0.\dot{6}$$

However, the difference between 152 and 148 represents 24 plots,
and we obtain the same result by taking

$$\tfrac{1}{24}(152 - 148)^2 = 0.\dot{6}$$

In general, the sum of squares of deviations of two quantities $X_1$
and $X_2$ from their mean is

$$(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 = \tfrac{1}{2}(X_1 - X_2)^2 \qquad (1)$$

Their variance is $(X_1 - X_2)^2/4$ and their standard deviation
$|X_1 - X_2|/2$.
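The agreement of the two methods is quickly confirmed. In the sketch below the function names are ours; `ss_from_totals` follows the method already employed, and `ss_two_totals` is the shortcut based on formula (1).

```python
def ss_from_totals(totals, plots_per_total):
    """Sum of squares among totals, each the sum of `plots_per_total` plots."""
    n = plots_per_total * len(totals)
    return sum(t * t for t in totals) / plots_per_total - sum(totals) ** 2 / n


def ss_two_totals(t1, t2, plots_per_total):
    """Shortcut for exactly two totals: squared difference over all plots."""
    return (t1 - t2) ** 2 / (2 * plots_per_total)


long_way = ss_from_totals([152, 148], 12)
short_way = ss_two_totals(152, 148, 12)
print(long_way, short_way)
```

The shortcut is formula (1) divided by the number of plots in each total, since each total here stands for 12 plots.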


67. Latin square. One arrangement frequently used is the
Latin square. If we are testing m varieties (or treatments), a
block of land is divided into a checkerboard arrangement of m
rows and m columns, and the varieties are distributed at random
in the plots, with the restriction that each variety occurs once
and but once in each row and also in each column. If A, B, C,
D, E are five varieties, we can form a five-by-five Latin square, a
typical example of which is shown in Fig. 12. It will be noted
that each letter occurs once and only once in each row and each
column.

        C   A   E   D   B
        D   B   A   E   C
        A   D   C   B   E
        B   E   D   C   A
        E   C   B   A   D

Fig. 12. Latin Square.

The fundamental identity for the Latin square of order
m (m rows, m columns, m varieties) is

$$
\begin{aligned}
\sum_{i=1}^{m}\sum_{j=1}^{m}(X_{ij}-\bar{X})^2
&= m\sum_{i=1}^{m}(\bar{X}_{i\cdot}-\bar{X})^2 + m\sum_{j=1}^{m}(\bar{X}_{\cdot j}-\bar{X})^2 + m\sum_{k=1}^{m}(\bar{X}_k-\bar{X})^2 \\
&\quad + \sum_{i=1}^{m}\sum_{j=1}^{m}(X_{ij}-\bar{X}_{i\cdot}-\bar{X}_{\cdot j}-\bar{X}_k+2\bar{X})^2 \qquad (2)
\end{aligned}
$$

In this formula i refers to row, j to column, and k to variety* (or
treatment). That is, $X_{ij}$ is the item in the ith row and jth column,
$\bar{X}_{i\cdot}$ is the mean of the ith row, $\bar{X}_{\cdot j}$ the mean of the jth column,
$\bar{X}_k$ the mean of the kth variety, and $\bar{X}$ the general mean. The
analysis of variance table showing the degrees of freedom is given
below (Table 53).

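TABLE 53
Analysis of Variance for a Latin Square of Order m

             Sum of squares of deviations                                                      Degrees of freedom
Rows         $m\sum_i(\bar{X}_{i\cdot}-\bar{X})^2$                                             m - 1
Columns      $m\sum_j(\bar{X}_{\cdot j}-\bar{X})^2$                                            m - 1
Varieties    $m\sum_k(\bar{X}_k-\bar{X})^2$                                                    m - 1
Error        $\sum_i\sum_j(X_{ij}-\bar{X}_{i\cdot}-\bar{X}_{\cdot j}-\bar{X}_k+2\bar{X})^2$    (m - 1)(m - 2)
Total        $\sum_i\sum_j(X_{ij}-\bar{X})^2$                                                  m² - 1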

* The subscript k would refer to the variety occurring in row i and column j.

The variation in the totals of rows and of columns gives an
indication of the amount of soil heterogeneity running in two
directions at right angles to each other, and it is the object of the
Latin square arrangement to remove the effect of this heterogeneity.
For a square larger than eight-by-eight, the rows and
columns tend to become too long, and the efficiency of the design
is impaired. For a four-by-four square there are three degrees
of freedom for varieties and six for error, and we find, by reference
to Snedecor's tables, that the mean square deviation for varieties
can be judged significant at the 5 per cent level if it is 4.76 times
that for error. But for a three-by-three square there are only two
degrees of freedom for varieties and the same number for error, so
that the variety mean square will have to be nineteen times the
error mean square before it can be judged significant. Consequently,
squares larger than eight-by-eight or smaller than four-by-four
are not recommended in practice.

For the sake of simplicity, however, we shall give the analysis
of a three-by-three square. In this square (Fig. 13) the v's in the
compartments refer to varieties of wheat; the numbers in the
compartments are the corresponding yields in bushels per acre.

Fig. 13. Latin Square Showing Yields of Three Varieties of Wheat.


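The calculation of the sum of squares of deviations is as follows:

Total

$$(11)^2 + (9)^2 + (8)^2 + (10)^2 + (18)^2 + (18)^2 + (20)^2 + (16)^2 + (9)^2 - \tfrac{1}{9}(119)^2 = 1751 - 1573.\dot{4} = 177.\dot{5}$$

Rows

$$\tfrac{1}{3}\left[(28)^2 + (46)^2 + (45)^2\right] - \tfrac{1}{9}(119)^2 = 1641.\dot{6} - 1573.\dot{4} = 68.\dot{2}$$

Columns

$$\tfrac{1}{3}\left[(41)^2 + (43)^2 + (35)^2\right] - \tfrac{1}{9}(119)^2 = 1585 - 1573.\dot{4} = 11.\dot{5}$$

Varieties

$$\tfrac{1}{3}\left[(34)^2 + (38)^2 + (47)^2\right] - \tfrac{1}{9}(119)^2 = 1603 - 1573.\dot{4} = 29.\dot{5}$$

Error

$$177.\dot{5} - (68.\dot{2} + 11.\dot{5} + 29.\dot{5}) = 177.\dot{5} - 109.\dot{3} = 68.\dot{2}$$

We have calculated the error term as a remainder, in which
way it is usually found. For illustrative purposes we shall also
calculate

$$
\begin{aligned}
\sum_i\sum_j(X_{ij}&-\bar{X}_{i\cdot}-\bar{X}_{\cdot j}-\bar{X}_k+2\bar{X})^2 \\
&= \left(11-\tfrac{28}{3}-\tfrac{41}{3}-\tfrac{38}{3}+2\times\tfrac{119}{9}\right)^2 + \left(9-\tfrac{28}{3}-\tfrac{43}{3}-\tfrac{47}{3}+2\times\tfrac{119}{9}\right)^2 \\
&\quad + \left(8-\tfrac{28}{3}-\tfrac{35}{3}-\tfrac{34}{3}+2\times\tfrac{119}{9}\right)^2 + \left(10-\tfrac{46}{3}-\tfrac{41}{3}-\tfrac{34}{3}+2\times\tfrac{119}{9}\right)^2 \\
&\quad + \left(18-\tfrac{46}{3}-\tfrac{43}{3}-\tfrac{38}{3}+2\times\tfrac{119}{9}\right)^2 + \left(18-\tfrac{46}{3}-\tfrac{35}{3}-\tfrac{47}{3}+2\times\tfrac{119}{9}\right)^2 \\
&\quad + \left(20-\tfrac{45}{3}-\tfrac{41}{3}-\tfrac{47}{3}+2\times\tfrac{119}{9}\right)^2 + \left(16-\tfrac{45}{3}-\tfrac{43}{3}-\tfrac{34}{3}+2\times\tfrac{119}{9}\right)^2 \\
&\quad + \left(9-\tfrac{45}{3}-\tfrac{35}{3}-\tfrac{38}{3}+2\times\tfrac{119}{9}\right)^2 \\
&= \tfrac{1}{81}\left[(16)^2 + (-35)^2 + (19)^2 + (-35)^2 + (19)^2 + (16)^2 + (19)^2 + (16)^2 + (-35)^2\right] \\
&= \tfrac{5526}{81} = 68.\dot{2}
\end{aligned}
$$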

The analysis of variance table (Table 54) follows.

TABLE 54
Analysis of Variance Table for Latin Square of Wheat Yields

             Sum of squares   Degrees of
             of deviations    freedom
Rows             68.22        3 - 1 = 2
Columns          11.56        3 - 1 = 2
Varieties        29.56        3 - 1 = 2
Error            68.22        2 × 1 = 2
Total           177.56        3² - 1 = 8
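The Latin square computation can be retraced as follows. This is a sketch only: the yields and the row, column, and variety totals are those used in the calculation above, but the assignment of varieties to plots in `VARIETIES` is an assumption of ours, inferred from the totals 34, 38, 47 and the residuals 16, -35, 19, ... of the error calculation, with the varieties numbered in the order of those totals.

```python
# Yields in the order used in the text (row by row).
YIELDS = [
    [11, 9, 8],
    [10, 18, 18],
    [20, 16, 9],
]
# Assumed layout: variety number in each plot (inferred, see lead-in).
VARIETIES = [
    [2, 3, 1],
    [1, 2, 3],
    [3, 1, 2],
]


def latin_square_anova(yields, varieties):
    """Return sums of squares for rows, columns, varieties, error, total."""
    m = len(yields)
    grand = sum(map(sum, yields))
    correction = grand ** 2 / m ** 2
    total = sum(v * v for row in yields for v in row) - correction
    rows = sum(sum(r) ** 2 for r in yields) / m - correction
    columns = sum(sum(c) ** 2 for c in zip(*yields)) / m - correction
    variety_totals = {}
    for y_row, v_row in zip(yields, varieties):
        for y, v in zip(y_row, v_row):
            variety_totals[v] = variety_totals.get(v, 0) + y
    var_ss = sum(t * t for t in variety_totals.values()) / m - correction
    # Error found directly from the residuals, not as a remainder.
    row_means = [sum(r) / m for r in yields]
    col_means = [sum(c) / m for c in zip(*yields)]
    mean = grand / m ** 2
    error = sum(
        (yields[i][j] - row_means[i] - col_means[j]
         - variety_totals[varieties[i][j]] / m + 2 * mean) ** 2
        for i in range(m) for j in range(m)
    )
    return {"rows": rows, "columns": columns, "varieties": var_ss,
            "error": error, "total": total, "variety_totals": variety_totals}


result = latin_square_anova(YIELDS, VARIETIES)
print({k: round(v, 2) for k, v in result.items() if k != "variety_totals"})
```

Because the error is computed here from the residuals, its agreement with total minus rows, columns, and varieties verifies identity (2) for this square.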


68. Factorial design and orthogonality. In any sort of experiment
it is usually better to vary several factors simultaneously
rather than one at a time. For example, in an agricultural experiment
on yields it is better to test several varieties with several
different kinds of fertilizer, and even with varying degrees or levels of
each kind of fertilizer. For if four different varieties were being
tested with the same fertilizer we might find, for example, that the
second variety gave the greatest yield. A separate investigation
on this variety with three kinds of fertilizer might show that the
first kind of fertilizer gave the best results. However, from these
two experiments we could not tell but that the second kind of
fertilizer, say, when applied to the fourth variety would give still
better results. If the experiments were all combined into one, in
which all four varieties are tested in conjunction with all three
kinds of fertilizer, much more information will be elicited, since a
large share of it is contained in the interactions among the various
factors at work.

The method of experimentation in which two or more sets of
factors, such as treatments and varieties, are taken in all combinations
has been called factorial design.* Factorial design had its
inception in agriculture, and its greatest development has taken
place in that science. However, it should doubtless find application
in biology and medicine, and in testing materials and
manufactured goods.

* See R. A. Fisher, "The Design of Experiments," Oliver and Boyd,
Edinburgh and London; also F. Yates, "Complex experiments," Supplement
to the Journal of the Royal Statistical Society, vol. 2, 1935, pp. 181-247.

The usefulness of the analysis of variance in testing significance
in factorial design consists in the multiplicity of ways in which the
sum of squares of deviations can be split up. For example, consider
the numbers $X_1$, $X_2$, $X_3$. Let us form the two linear expressions

$$X_1 - X_2$$
$$X_1 + X_2 - 2X_3$$

which, together with the sum

$$X_1 + X_2 + X_3$$

constitute a mutually orthogonal set. This means that the sum of
products of corresponding coefficients in any two members of the
set is zero, e.g., $1 \times 1 + (-1) \times 1 + 0 \times (-2) = 0$. The term
"orthogonal" comes from geometry; we recall that, if $a_1, b_1, c_1$
and $a_2, b_2, c_2$ are the direction numbers of two lines, these lines
will be orthogonal (that is, perpendicular) if $a_1a_2 + b_1b_2 + c_1c_2 = 0$.
It is worth noting that, if a linear function of the X's is orthogonal
to their sum, then the sum of the coefficients of this linear function
is zero. Thus, for the above linear functions, we have 1 - 1 = 0
and 1 + 1 - 2 = 0.

When we have a set of linear combinations of a series of numbers,
which, together with the sum of the numbers, constitutes a
complete * mutually orthogonal set, we have one way of subdividing
the sums of squares of deviations of these numbers from
their mean. In the above example we have

$$\frac{(X_1 - X_2)^2}{1^2 + (-1)^2} + \frac{(X_1 + X_2 - 2X_3)^2}{1^2 + 1^2 + (-2)^2} = (X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + (X_3 - \bar{X})^2$$

as can readily be verified. The denominators on the left are the
sums of squares of coefficients.
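The verification may be carried out numerically. In the sketch below the three numbers are the variety totals 46, 52, 54 of Table 50, chosen merely as convenient values; the function name `contrast_ss` is ours.

```python
def contrast_ss(values, coefficients):
    """Square of a linear combination divided by the sum of squared coefficients."""
    combination = sum(c * v for c, v in zip(coefficients, values))
    return combination ** 2 / sum(c * c for c in coefficients)


values = [46, 52, 54]
first = contrast_ss(values, [1, -1, 0])     # X1 - X2
second = contrast_ss(values, [1, 1, -2])    # X1 + X2 - 2*X3

mean = sum(values) / len(values)
direct = sum((v - mean) ** 2 for v in values)

print(first, second, first + second, direct)
```

With these particular values each total represents four plots, so that dividing the result by 4 gives the variety sum of squares found in the randomized block example, now split into two parts of one degree of freedom each.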

There are many ways of forming orthogonal sets. Thus, with
four variables we might, to enumerate three such sets, have

$$
\begin{array}{lll}
X_1 - X_2 & X_1 + X_2 - X_3 - X_4 & -3X_1 - X_2 + X_3 + 3X_4 \\
X_1 + X_2 - 2X_3 & X_1 - X_2 & X_1 - X_2 - X_3 + X_4 \\
X_1 + X_2 + X_3 - 3X_4 & X_3 - X_4 & -X_1 + 3X_2 - 3X_3 + X_4
\end{array}
$$

Each column, together with the sum, $X_1 + X_2 + X_3 + X_4$,
constitutes a complete mutually orthogonal set. If the X's
correspond to treatments, for example, any set may be regarded
as comparisons, and, moreover, independent comparisons among
them. (With four variables we must have three independent
comparisons.) That is, if $X_1$ is the total yield of plots treated
with the first fertilizer, and so on, then examining the first set we
see that $X_1 - X_2$ compares the yield due to the first treatment
with that due to the second, $X_1 + X_2 - 2X_3$ compares the sum
of the yields due to the first and second treatments with twice that
due to the third, while $X_1 + X_2 + X_3 - 3X_4$ compares the yields
of the first three treatments with three times that of the fourth.
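That each of the three sets is mutually orthogonal, and orthogonal to the sum, can be confirmed by forming the sums of products of coefficients. A short sketch, with names of our own choosing:

```python
from itertools import combinations

SUM = (1, 1, 1, 1)   # coefficients of X1 + X2 + X3 + X4

# Coefficients of X1..X4 in each linear combination, one list per set.
SETS = {
    "first": [(1, -1, 0, 0), (1, 1, -2, 0), (1, 1, 1, -3)],
    "second": [(1, 1, -1, -1), (1, -1, 0, 0), (0, 0, 1, -1)],
    "third": [(-3, -1, 1, 3), (1, -1, -1, 1), (-1, 3, -3, 1)],
}


def dot(u, v):
    """Sum of products of corresponding coefficients."""
    return sum(a * b for a, b in zip(u, v))


def is_mutually_orthogonal(vectors):
    return all(dot(u, v) == 0 for u, v in combinations(vectors, 2))


results = {name: is_mutually_orthogonal(vectors + [SUM]) for name, vectors in SETS.items()}
print(results)
```

Each set, with the sum adjoined, contains four combinations of four numbers and is therefore complete in the sense used above.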

* By a complete set we mean one that contains just as many linear
combinations (including the sum of the numbers) as there are numbers.

To see how this method of subdivision may be used to advantage
in factorial design, let us consider an experiment in which we
are testing three kinds of fertilizer, a, b, c. We might not merely
want to apply them alone, but in conjunction with one another, so
that altogether we should have not three treatments, but eight,
which might be designated by

$$abc,\quad ab,\quad ac,\quad bc,\quad a,\quad b,\quad c,\quad (1)$$

the symbol (1) denoting the absence of application of a, b, and c.
The symbol abc, for example, may be regarded qualitatively as
the treatment containing all three ingredients, or quantitatively
as the total yield of plots treated with all three ingredients. We
should have blocks of land with eight plots each, the treatments
being scattered at random over the blocks, each block, however,
containing all the treatments. A typical block is shown in the
accompanying diagram.

        ab      b       (1)     abc
        c       bc      ac      a

If we had five such blocks or replications, the degrees of freedom
in the analysis of variance would be as follows:

                 Degrees of freedom
Blocks              5 - 1 = 4
Treatments          8 - 1 = 7
Error               4 × 7 = 28
Total           5 × 8 - 1 = 39

The number of degrees of freedom for treatments is the number of
independent comparisons.

The total or the mean yield of plots having the treatment abc 
could be calculated, similarly for ab and all the others, and we 
could determine whether the variation in these means was or was 
not accidental, and any one of them could be tested individually. 

Instead of considering the treatments as they are, we might be 
more interested in subdividing them in* another manner. For 
instance, instead of considering merely a as contrasted with (1) we 
might want a comparison of the yields of plots containing a, with 








FACTORIAL DESIGN AND ORTHOGONALITY ‘ 173 


or without any other ingredient, with the jrields of plots not con¬ 
taining a at all, that is, 

abc + a6 + oc + u — — 6 — c — (1) 

Or we might want a comparison of the effect of a in the presence 
of h with that of a in the absence of 6, viz., 

abc + ab — ac -- a — he — h c + (X) 

This is also a comparison of the effect of h in the presence of a with 
that of 6 in the absence of a. We can thus arrive at the seven 
independent comparisons: 

A - (a—l)(6+l)(c+l) = a5c+a6+ac—6c+o—6 —c—(1 ) 

B = (a+l)(& —l)(c+l) = a6c+a6—ac+6c-“a+6 —c—(1 ) 

C = (a+l)(6+l)(c—1) = a5c—a6+ac+^>c—a—6 +c—(1 ) 

AB = (a—l)(6—l)(c+l) = o5c+a6—ac--6 c—a—6 +c+(l) 

AC = (o—l)(5+l)(c—1) = abc—db+ac—hc—a+h—c+{l) 

BC = (a+l)(6—l)(c—1) = a6c--a6—ac+&c+a—6—c+(l) 
ABC = (a—1)(6—l)(c-~l) = abc—ab—ac—bc+a+h+c—(l) 

Note the method of obtaining each of them as a symbolic algebraic 
product. Note also that they are orthogonal among themselves 
and likewise to the blocks,* so that the sum of their squares 
divided by the sum of the squares of their coefficients is equal to 
the sum of squares of their deviations from their mean. To each 
of them corresponds one degree of freedom, and the seven degrees 
of freedom due to treatments are completely accounted for in a 
manner which has a useful interpretation in treatment contrast. 

As a numerical case, suppose that the total yields for five 
blocks are as follows, the unit being immaterial: 

        abc = 48     ab = 54     ac = 37     bc = 20 

        a = 40       b = 31      c = 25      (1) = 15 

* Since the total yield of a block is 

        abc + ab + ac + bc + a + b + c + (1). 





Then for the sum of squares of deviations for treatments we 
should have 

        (1/5)[(48)^2 + (54)^2 + (37)^2 + (20)^2 + (40)^2 + (31)^2 + (25)^2 + (15)^2] 

            - (48 + 54 + 37 + 20 + 40 + 31 + 25 + 15)^2 / (5 × 8) 

        = (1/5) × 10400 - (1/40)(270)^2 = 2080 - 1822.5 = 257.5 

We take one-fifth of the square bracket because each total such as 
abc represents five blocks. Similarly, we take one-fortieth of the 
total squared since there are five blocks of eight plots each. Cf. 
formula (33) of section 63. 

We also find 


        A   = 48 + 54 + 37 - 20 + 40 - 31 - 25 - 15 =  88 
        B   = 48 + 54 - 37 + 20 - 40 + 31 - 25 - 15 =  36 
        C   = 48 - 54 + 37 + 20 - 40 - 31 + 25 - 15 = -10 
        AB  = 48 + 54 - 37 - 20 - 40 - 31 + 25 + 15 =  14 
        AC  = 48 - 54 + 37 - 20 - 40 + 31 - 25 + 15 =  -8 
        BC  = 48 - 54 - 37 + 20 + 40 - 31 - 25 + 15 = -24 
        ABC = 48 - 54 - 37 - 20 + 40 + 31 + 25 - 15 =  18 


The sum of squares of these quantities is 

        (88)^2 + (36)^2 + (-10)^2 + (14)^2 + (-8)^2 + (-24)^2 + (18)^2 = 10,300 

The sum of squares of coefficients of the abc, ab, etc., in any 
expression such as A is 8, and since each abc, ab, etc., is the total of 
5 blocks we must divide 10,300 by 5 × 8. This gives 257.5, which 
agrees with the value obtained above. 
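
The agreement between the two calculations can be checked mechanically. The following is a minimal sketch in Python, not part of the original text: the helper `sign` and the variable names are illustrative assumptions, and the signs are generated from the symbolic products such as A = (a - 1)(b + 1)(c + 1).

```python
# Treatment totals over five blocks, as given in the text.
totals = {"abc": 48, "ab": 54, "ac": 37, "bc": 20,
          "a": 40, "b": 31, "c": 25, "(1)": 15}

def sign(effect, treatment):
    """Sign of `treatment` in the symbolic product defining `effect`.

    A factor (x - 1) contributes -1 when ingredient x is absent from the
    treatment; a factor (x + 1) always contributes +1.
    """
    s = 1
    for letter in "abc":
        if letter.upper() in effect and letter not in treatment:
            s = -s
    return s

effects = ["A", "B", "C", "AB", "AC", "BC", "ABC"]
contrasts = {e: sum(sign(e, t) * y for t, y in totals.items()) for e in effects}

n_blocks = 5
n_treatments = len(totals)

# Sum of squares from the seven orthogonal contrasts.
ss_contrasts = sum(v * v for v in contrasts.values()) / (n_blocks * n_treatments)

# Sum of squares computed directly from the treatment totals.
grand = sum(totals.values())
ss_direct = (sum(y * y for y in totals.values()) / n_blocks
             - grand ** 2 / (n_blocks * n_treatments))

print(contrasts)
print(ss_contrasts, ss_direct)
```

Both routes give the same treatment sum of squares because the seven contrasts are mutually orthogonal and each has eight coefficients of unit magnitude.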

69. Confounding. The treatment contrasts A, B, C are called 
main effects; the contrasts AB, AC, BC are called interactions 
of first order; ABC is an interaction of second order. If 
more ingredients are used or if different levels of ingredients are 
employed (double doses or triple doses) the number of separate 
treatments is greatly increased, and to accommodate all of them 
in each block would require blocks of large size, since the 
individual plots of the blocks can not be too small.* But if the blocks 
were larger we should be more likely to encounter soil heterogeneity 
within blocks. One way by which this can be controlled is by 
confounding, that is, by not completely replicating within each 
block. A simple example will make this clear. 

Suppose that in the experiment just described we use ten 
blocks instead of five but have only four plots in each block. If 
five of the blocks contain the treatments 

        abc, a, b, c 

and the other five the treatments 

        ab, ac, bc, (1) 

the second-order interaction ABC is confounded with blocks. 
This interaction is not orthogonal to the blocks. For we recall 
that 

        ABC = abc - ab - ac - bc + a + b + c - (1) 

and this is orthogonal neither to 

        abc + a + b + c 

nor to 

        ab + ac + bc + (1) 

The main effects and the first-order interactions are orthogonal 
to the blocks and are unaffected by block differences, since for 
any one of these there are two positive and two negative treatment 
combinations occurring in each block. The degrees of freedom 
for the analysis of variance would appear as follows: 


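                                                  Degrees of freedom 

        Blocks ................................    10 - 1 = 9 
        Treatments (A, B, C, AB, AC, BC) ......    6 
        Error .................................    2(5 - 1)(4 - 1) = 24 

        Total .................................    4 × 10 - 1 = 39 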


* The language used is that of agricultural experiment, but it seems best 
to speak in concrete terms belonging to a specific science rather than to 
attempt to invent terms of greater generality, which, although of wider 
applicability, might not convey the concepts so well. 









The contrast ABC has been sacrificed but it has been determined 
experimentally that higher-order interactions are often unimpor¬ 
tant. In this case it is possible to evaluate the other six contrasts 
with whatever additional precision has been gained by eliminating 
soil heterogeneity through the use of smaller blocks. 
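
The statement that only ABC is confounded can be verified by summing the signs of each contrast within each kind of block. The sketch below is an illustration in Python and not part of the original text; the helper `sign` is a hypothetical name, and the two block compositions are those given in the example above.

```python
def sign(effect, treatment):
    """Sign of `treatment` in the symbolic product defining `effect`."""
    s = 1
    for letter in "abc":
        if letter.upper() in effect and letter not in treatment:
            s = -s
    return s

# The two kinds of block used in the confounded design.
blocks = [["abc", "a", "b", "c"], ["ab", "ac", "bc", "(1)"]]
effects = ["A", "B", "C", "AB", "AC", "BC", "ABC"]

# A contrast is orthogonal to blocks when its signs sum to zero in every block,
# i.e. two positive and two negative treatment combinations occur in each.
within_block_sums = {e: [sum(sign(e, t) for t in blk) for blk in blocks]
                     for e in effects}
confounded = [e for e, sums in within_block_sums.items() if any(sums)]

print(within_block_sums)
print(confounded)
```

A nonzero within-block sum means the contrast changes with the block totals, which is exactly what confounding with blocks means here.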

70. Partial confounding. We have seen that, in an experiment 
involving the ingredients a, b, c in which a treatment may 
contain none, one, or two of these ingredients, it is possible to 
confound one of the treatment effects such as ABC with blocks, 
thus gaining greater precision by using homogeneous material, at 
the expense, however, of losing all information concerning the 
effect ABC. Instead of confounding this same interaction in all 
the blocks it would be possible to confound different interactions 
in different sets of blocks. This procedure is called partial 
confounding. Some information will then be obtained on all the 
interactions, the loss of information being spread over several 
interactions instead of being confined to one. 

In the foregoing example we might confound each of the 
first-order interactions AB, AC, BC in a quarter of the blocks and the 
second-order interaction ABC in the remaining quarter. A 
complete set of blocks would then look like the accompanying 
diagram. 

Here the interaction AB is confounded in the first column of 
blocks, BC in the second column, CA in the third, and ABC in the 
fourth. 

To illustrate the use of the analysis of variance in partial 
confounding let us take an even simpler example, one in which we 
deal with only two ingredients a and b. We remember that the 
main effects and the interaction are 

        A  = (a - 1)(b + 1) = ab + a - b - (1) 

        B  = (a + 1)(b - 1) = ab + b - a - (1) 

        AB = (a - 1)(b - 1) = ab + (1) - a - b 

Suppose that we confound A in half of the blocks and B in the 
other half, leaving AB unconfounded. With eight blocks we 
could have two replications as shown below.* 

          I        II       III      IV 

          ab       b        ab       a 
          a        (1)      b        (1) 

          I'       II'      III'     IV' 

          ab       b        ab       a 
          a        (1)      b        (1) 


We note that A is confounded in blocks I, II, I', II', B in blocks 
III, IV, III', IV', and that AB is unconfounded, since from each 
block we use a plus sign with one treatment and a minus sign 
with the other. 

We can then calculate the AB sum of squares from all the 
blocks. A is calculated from blocks III, IV, III', IV'; B, from 
blocks I, II, I', II'. The sum of squares for error can be obtained 
by contrasting treatment differences in blocks I and I', II and II', 
III and III', IV and IV'. These contrasts give us four degrees of 
freedom; a fifth degree of freedom for error is involved in the 
error component obtained when abstracting AB. The sum of 
squares for blocks (seven degrees of freedom) can be obtained 
directly or may be resolved into contrasts between I and I', II 
and II', III and III', IV and IV' (one degree in each of these, or 
four degrees altogether); the contrast between I + II + I' + 
II' and III + IV + III' + IV' (one degree), which is the blocks 
component obtained when evaluating AB; I + I' versus II + II' 
(one degree); and III + III' versus IV + IV' (one degree). 
These last two contrasts are seen to be closely related to the 
treatment effects A and B, being the block differences with which they 
are partially confounded. 

* They are not randomized in the diagram. 

The degrees of freedom table for the analysis of variance would 
then appear as follows: 

                                        Degrees of freedom 

        Blocks ......................          7 
        Treatments (A, B, AB) .......          3 
        Error .......................          5 

        Total .......................         15 


This section will be concluded by a simple numerical illustra¬ 
tion of the set-up just described. Suppose that the values apper¬ 
taining to the plots are as shown below: 


           I              I'             II             II' 

        ab     7       ab     6       b      2       b      3 
        a      5       a      4       (1)    2       (1)    2 
              --             --             --             -- 
              12             10              4              5 

           III            III'           IV             IV' 

        ab     9       ab     7       a      8       a      6 
        b      4       b      3       (1)    2       (1)    2 
              --             --             --             -- 
              13             10             10              8 


The total sum of squares of deviations is 

        7^2 + 5^2 + 2^2 + 2^2 + 9^2 + 4^2 + 8^2 + 2^2 + 6^2 + 4^2 + 3^2 + 2^2 + 7^2 + 3^2 + 6^2 + 2^2 

            - (1/16)(7 + 5 + 2 + 2 + 9 + 4 + 8 + 2 + 6 + 4 + 3 + 2 + 7 + 3 + 6 + 2)^2 

        = 410 - (72)^2/16 = 410 - 324 = 86 

For blocks we find 

        (1/2)[(12)^2 + 4^2 + (13)^2 + (10)^2 + (10)^2 + 5^2 + (10)^2 + 8^2] - (72)^2/16 

        = 359 - 324 = 35 
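
The two sums of squares just obtained can be reproduced from the plot values. The following is a minimal Python sketch, not part of the original text; the dictionary layout and variable names are illustrative assumptions.

```python
# Plot yields from the numerical illustration, keyed by block and treatment.
yields = {
    "I":   {"ab": 7, "a": 5},   "I'":   {"ab": 6, "a": 4},
    "II":  {"b": 2, "(1)": 2},  "II'":  {"b": 3, "(1)": 2},
    "III": {"ab": 9, "b": 4},   "III'": {"ab": 7, "b": 3},
    "IV":  {"a": 8, "(1)": 2},  "IV'":  {"a": 6, "(1)": 2},
}

values = [y for block in yields.values() for y in block.values()]
n_plots = len(values)                       # 16 plots in all
correction = sum(values) ** 2 / n_plots     # (72)^2 / 16

# Total sum of squares of deviations.
ss_total = sum(y * y for y in values) - correction

# Sum of squares for blocks: each block total covers two plots.
block_totals = {name: sum(block.values()) for name, block in yields.items()}
ss_blocks = sum(t * t for t in block_totals.values()) / 2 - correction

print(ss_total, ss_blocks)
```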















As stated above, the AB sum of squares is calculated from all 
blocks. So as to be able to obtain the error component when 
abstracting AB, we make the following arrangement (Table 55): 

TABLE 55 

                                   Blocks 

        Treatments     I, II, I', II'     III, IV, III', IV'     Total 

        ab + (1)            17                  20                37 
        a + b               14                  21                35 

        Total               31                  41                72 


The sum of squares of deviations can be calculated in the ordinary 
manner, but we shall here make use of the following formulas, 
which are applicable when we have only two divisions, and which 
are simpler from the standpoint of computation. Consider 
Table 56. 

TABLE 56 

                                        Total 

                    x11       x12        T1 
                    x21       x22        T2 

        Total       T'1       T'2        T 

The sums of squares of deviations are: 

        For rows        (1/4)(T1 - T2)^2 

        For columns     (1/4)(T'1 - T'2)^2 

        For error       (1/4)[(x11 + x22) - (x12 + x21)]^2 

The first two of these follow from (1) of section 66; the third is 
readily demonstrated. In applying them to Table 55 we must 
realize that each difference of totals involves sixteen plots instead 
of four and that consequently we must multiply by 1/16 instead of 1/4. 

TABLE 57 

                                          Sum of squares of deviations     Degrees of 
                                                                            freedom 

        Treatment AB ..................   (1/16)(37 - 35)^2 = 0.25              1 

        Blocks (I + II + I' + II') 
            - (III + IV + III' + IV')     (1/16)(31 - 41)^2 = 6.25              1 

        Error .........................   (1/16)[(17 + 21) - (20 + 14)]^2 = 1   1 

        Total .........................   7.5                                   3 

To check the total we calculate 

        (1/4)[(17)^2 + (20)^2 + (14)^2 + (21)^2] - (1/16)(72)^2 = 7.5 

To calculate the sum of squares of deviations for treatment 
contrast A, we use only blocks III, IV, III', IV', since this contrast 
is confounded in the others. We find this sum to be equal to 

        (1/8)[ab + a - b - (1)]^2 = (1/8)(9 + 7 + 8 + 6 - 4 - 3 - 2 - 2)^2 

                                  = (1/8)(19)^2 = 45.125 

From blocks I, II, I', II' we find the sum of squares of 
deviations for treatment contrast B to be equal to 

        (1/8)[ab - a + b - (1)]^2 = (1/8)(7 + 6 - 5 - 4 + 2 + 3 - 2 - 2)^2 

                                  = (1/8) × 5^2 = 3.125 
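
As a check on the three treatment sums of squares, each contrast can be evaluated over just those blocks in which it is unconfounded. The sketch below is an illustration in Python and not part of the original text; `SIGNS` and `contrast_ss` are hypothetical names, and the divisor is the number of plots entering the contrast, since every coefficient is +1 or -1.

```python
yields = {
    "I":   {"ab": 7, "a": 5},   "I'":   {"ab": 6, "a": 4},
    "II":  {"b": 2, "(1)": 2},  "II'":  {"b": 3, "(1)": 2},
    "III": {"ab": 9, "b": 4},   "III'": {"ab": 7, "b": 3},
    "IV":  {"a": 8, "(1)": 2},  "IV'":  {"a": 6, "(1)": 2},
}

# Signs of each treatment in A = (a-1)(b+1), B = (a+1)(b-1), AB = (a-1)(b-1).
SIGNS = {
    "A":  {"ab": 1, "a": 1,  "b": -1, "(1)": -1},
    "B":  {"ab": 1, "a": -1, "b": 1,  "(1)": -1},
    "AB": {"ab": 1, "a": -1, "b": -1, "(1)": 1},
}

def contrast_ss(effect, block_names):
    """Return (contrast value, sum of squares) using only the named blocks."""
    value = sum(SIGNS[effect][t] * y
                for b in block_names for t, y in yields[b].items())
    n_plots = sum(len(yields[b]) for b in block_names)
    return value, value ** 2 / n_plots

a_value, ss_a = contrast_ss("A", ["III", "IV", "III'", "IV'"])
b_value, ss_b = contrast_ss("B", ["I", "II", "I'", "II'"])
ab_value, ss_ab = contrast_ss("AB", list(yields))

print(ss_a, ss_b, ss_ab)
```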

We are now ready to abstract the remaining sums of squares 
of deviations for error. If there is a fertility difference between 
two similar blocks, such as III and III', it will be eliminated if we 
compare the differences in yield due to treatment, viz., ab - b, in 
these two blocks. These differences are 9 - 4 = 5 and 7 - 3 = 4, 
respectively. The sum of squares of deviations can be calculated 
either as 

        (1/2)(5^2 + 4^2) - (1/4)(5 + 4)^2 = 0.25 

or as (5 - 4)^2/4 = 0.25, the latter way being in general simpler. 









The treatment differences for the various blocks are shown 
below. 

        TABLE 58A                       TABLE 58B 

                  ab - a                          b - (1) 
        I         7 - 5 = 2             II        2 - 2 = 0 
        I'        6 - 4 = 2             II'       3 - 2 = 1 
        Difference    = 0               Difference    = -1 

        TABLE 58C                       TABLE 58D 

                  ab - b                          a - (1) 
        III       9 - 4 = 5             IV        8 - 2 = 6 
        III'      7 - 3 = 4             IV'       6 - 2 = 4 
        Difference    = 1               Difference    = 2 

        TABLE 58E 

                                        ab - a - b + (1) 
        I + II + I' + II'               13 -  9 - 5 + 4 =  3 
        III + IV + III' + IV'           16 - 14 - 7 + 4 = -1 
        Difference                                      =  4 


Thus for the sum of squares of deviations due to error we have 
the following analysis: 

TABLE 59 
Error 

        Blocks                            Sum of squares of deviations     Degrees of 
                                                                            freedom 

        I - I' ........................   (1/4) × 0^2    = 0                   1 
        II - II' ......................   (1/4) × (-1)^2 = 0.25                1 
        III - III' ....................   (1/4) × 1^2    = 0.25                1 
        IV - IV' ......................   (1/4) × 2^2    = 1                   1 
        (I + II + I' + II') 
            - (III + IV + III' + IV')     (See AB analysis) 1                  1 

        Total .........................   2.5                                  5 
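
The five error components can be recomputed from the paired blocks. The following is a minimal Python sketch, not part of the original text; the names are illustrative assumptions. Each component compares the within-block treatment difference in a block with that in its replicate, and the divisor 4 is the number of plots entering that comparison.

```python
# For each pair of similar blocks: (yields in the block), (yields in its replicate),
# each listed as (first treatment, second treatment) in the order of Tables 58A-58D.
pairs = {
    "I - I'":     ((7, 5), (6, 4)),   # ab - a
    "II - II'":   ((2, 2), (3, 2)),   # b - (1)
    "III - III'": ((9, 4), (7, 3)),   # ab - b
    "IV - IV'":   ((8, 2), (6, 2)),   # a - (1)
}

components = {
    name: ((x1 - y1) - (x2 - y2)) ** 2 / 4
    for name, ((x1, y1), (x2, y2)) in pairs.items()
}

# Error component from the AB analysis (Table 55 cell totals, sixteen plots).
components["AB analysis"] = ((17 + 21) - (20 + 14)) ** 2 / 16

ss_error = sum(components.values())

# The same figure follows by subtraction: total - blocks - treatments.
ss_error_by_subtraction = 86 - 35 - (0.25 + 45.125 + 3.125)

print(components)
print(ss_error, ss_error_by_subtraction)
```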

















Although the sum of squares of deviations among block means 
has already been calculated we shall for completeness show the 
analysis. 

TABLE 60 
Blocks 

                                          Sum of squares of deviations     Degrees of 
                                                                            freedom 

        I - I' ........................   (1/4)(12 - 10)^2 = 1                 1 
        II - II' ......................   (1/4)(4 - 5)^2   = 0.25              1 
        III - III' ....................   (1/4)(13 - 10)^2 = 2.25              1 
        IV - IV' ......................   (1/4)(10 - 8)^2  = 1                 1 
        (I + II + I' + II') 
            - (III + IV + III' + IV')     (See AB analysis) 6.25               1 
        (I + I') - (II + II') .........   (1/8)(22 - 9)^2  = 21.125            1 
        (III + III') - (IV + IV') .....   (1/8)(23 - 18)^2 = 3.125             1 

        Total .........................   35                                   7 


Results are recapitulated in Table 61. 

TABLE 61 
Analysis of Variance for Partial Confounding 

                                   Sum of squares          Degrees of 
                                   of deviations            freedom 

        Blocks ..............         35                       7 

        Treatments   AB .....          0.25 
                     A ......         45.125    48.5           3 
                     B ......          3.125 

        Error ...............          2.5                     5 

        Total ...............         86                      15 


71. Dummy treatments. Sometimes, in order to have each 
ingredient or factor occur with proportional frequency in 
combination with the variants of other factors, it becomes necessary 
to use so-called dummy treatments. For example, if we are 
investigating three levels (degrees) and three kinds of nitrogenous 
fertilizer, the three plots of each replication receiving no nitrogen 
are indistinguishable. The accompanying diagram, representing 
such a replication, will make this clearer. 

        a0'     a0''    a0''' 
        a1'     a1''    a1''' 
        a2'     a2''    a2''' 

The subscripts represent the level of the ingredient, that is, the 
quantities of it applied. The subscript 0 denotes that none has been 
applied, the subscript 1 indicates a single application, the subscript 2 
indicates a double dose. The primes, etc., represent different kinds 
of ingredients; for example, they might be sulphate of ammonia, 
chloride of ammonia, and cyanamide. It is evident that there is no 
difference in the a0's; in fact, their superscripts should be deleted 
since they are meaningless. Thus we have not nine distinct treatments 
but only seven, and our degrees of freedom in the analysis of variance 
would be, if we had, say, five blocks: 


                                        Degrees of freedom 

        Blocks ......................    5 - 1 = 4 
        Treatments ..................    7 - 1 = 6 
        Error   Among blocks ........    4 × 6 = 24 
                Within blocks .......    5(3 - 1) = 10 

        Total .......................    5 × 9 - 1 = 44 


If we wish to analyze the treatment degrees of freedom further 
we may do so as follows: two degrees for comparison of the three 
levels of a, i.e., a0, a1, a2; two degrees for comparison of the three 
kinds of treatment, i.e., a', a'', a'''; two degrees, (3 - 1)(2 - 1), 
for interaction between level and kind, i.e., 

        a1'     a1''    a1''' 
        a2'     a2''    a2''' 

(The zero level would not be included here.) 











The sum of squares of deviations for error within blocks is 
obtained by comparing the three dummy plots in each block. 
There are thus two degrees of freedom for each block, or ten 
altogether. 

The sum of squares of deviations for error among blocks is 
composed of three parts: 

(i) The remainder or interaction term obtained when comparing 
the three treatments a1', a1'', a1''' in the five blocks, the degrees 
of freedom being (3 - 1)(5 - 1) = 8. 

(ii) The interaction term obtained when comparing the three 
treatments a2', a2'', a2''' in the five blocks, the degrees of freedom 
being eight as before. 

(iii) The interaction term obtained when comparing the three 
levels of treatment a0, a1, a2 in the five blocks, the number of 
degrees of freedom again being eight. 

It is quite possible to have dummy treatments in connection 
with confounding, but for this and for more complex examples the 
reader is referred to Fisher* and Yates.† 

As a numerical example of dummy treatments consider the 
following two blocks, where the notation has the same meaning 
as earlier in the section; that is, subscripts refer to different levels 
of treatment and superscripts to different kinds of treatment. The 
numbers are yields. 


                   Block I                                 Block II 

        a0      a0      a0       Total          a0      a0      a0       Total 
        2       4       2          8            3       2       2          7 

        a1'     a1''    a1'''                   a1'     a1''    a1''' 
        6       4       3         13            7       6       5         18 

        a2'     a2''    a2'''                   a2'     a2''    a2''' 
        7       5       9         21            8       6       9         23 

                                  --                                      -- 
                                  42                                      48 


* R. A. Fisher, "The Design of Experiments." 

† F. Yates, "The principles of orthogonality and confounding in replicated 
experiments," Journal of Agricultural Science, vol. 23, 1933, pp. 108-144; 
"Complex experiments," Supplement to the Journal of the Royal Statistical 
Society, vol. 2, 1935, pp. 181-247. 


The plots are not arranged at random, as they would be in the 
field, but are placed in corresponding positions so that totals of 
corresponding treatments can more readily be obtained. 

We first calculate the total sum of squares of deviations, 
which is 

        2^2 + 4^2 + 2^2 + 6^2 + 4^2 + 3^2 + 7^2 + 5^2 + 9^2 
        + 3^2 + 2^2 + 2^2 + 7^2 + 6^2 + 5^2 + 8^2 + 6^2 + 9^2 - (42 + 48)^2/18 

        = 548 - 450 = 98 

The number of degrees of freedom is the number of plots less one, 
viz., 17. 

For the difference between blocks we have 

        (42 - 48)^2/18 = 2 

with one degree of freedom. 

We now take up the three different levels of treatment: 

TABLE 62A 

                          a0      a1      a2      Total 

        Block I ......     8      13      21        42 
        Block II .....     7      18      23        48 

        Total ........    15      31      44        90 

The total sum of squares of deviations is 

        (1/3)[(8)^2 + (13)^2 + (21)^2 + (7)^2 + (18)^2 + (23)^2] - (90)^2/18 

        = 525.33 - 450 = 75.33 

with five degrees of freedom. 

The sum of squares of deviations for different levels of 
treatment is 

        (1/6)[(15)^2 + (31)^2 + (44)^2] - (90)^2/18 = 520.33 - 450 = 70.33 

with two degrees of freedom. 








For block differences we have already found the sum of squares 
of deviations to be two, with one degree of freedom, and we can 
find the error term by subtraction: 

        75.33 - (70.33 + 2) = 3 

the number of degrees of freedom being two. 
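
The analysis by levels can be reproduced as follows. This is a minimal Python sketch, not part of the original text; the dictionary layout and variable names are illustrative assumptions, and the three plots at each level within a block are simply pooled.

```python
# Yields by block and level; the three plots at each level are pooled.
blocks = {
    "I":  {"a0": [2, 4, 2], "a1": [6, 4, 3], "a2": [7, 5, 9]},
    "II": {"a0": [3, 2, 2], "a1": [7, 6, 5], "a2": [8, 6, 9]},
}
levels = ["a0", "a1", "a2"]

# Cell totals of Table 62A (each is a total of three plots).
cell = {(b, l): sum(blocks[b][l]) for b in blocks for l in levels}
n_plots = sum(len(ys) for b in blocks.values() for ys in b.values())   # 18
correction = sum(cell.values()) ** 2 / n_plots                         # (90)^2 / 18

ss_cells = sum(t * t for t in cell.values()) / 3 - correction

level_totals = [sum(cell[(b, l)] for b in blocks) for l in levels]     # 6 plots each
ss_levels = sum(t * t for t in level_totals) / 6 - correction

block_totals = [sum(cell[(b, l)] for l in levels) for b in blocks]     # 9 plots each
ss_blocks = sum(t * t for t in block_totals) / 9 - correction

# Blocks x levels interaction, used as one component of error.
ss_error = ss_cells - ss_levels - ss_blocks

print(round(ss_cells, 2), round(ss_levels, 2), ss_blocks, round(ss_error, 2))
```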

Next we consider the different kinds of treatment. Here the 
a0 plots are, of course, left out of consideration. Since we wish to 
get the interaction between levels and kinds of treatment we form 
the following table, in which blocks I and II are combined: 


TABLE 62B 

                        a'      a''     a'''    Total 

        a1 .........    13      10       8        31 
        a2 .........    15      11      18        44 

        Total ......    28      21      26        75 

The total sum of squares of deviations for this group is 

        (1/2)[(13)^2 + (10)^2 + (8)^2 + (15)^2 + (11)^2 + (18)^2] - (75)^2/12 

        = 501.5 - 468.75 = 32.75 

the number of degrees of freedom being five. Note that each 
value, such as 13, is the total of two plot-yields, so that the total 
of 75 is for 12 plots. 

For the sum of squares of deviations due to different kinds of 
treatment we obtain 

        (1/4)[(28)^2 + (21)^2 + (26)^2] - (75)^2/12 = 475.25 - 468.75 = 6.5 

with two degrees of freedom. 

For the different levels in this group we find (31 - 44)^2/12 
= 14.08, with one degree of freedom, so that for the interaction 
term among the three kinds and the two levels of treatment we find 

        32.75 - (6.5 + 14.08) = 12.17 

The corresponding number of degrees of freedom is (3 - 1)(2 - 1) 
= 2, or it can be found by subtraction: 5 - (2 + 1) = 2. 
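
The breakdown of Table 62B into kinds, levels, and their interaction can be checked in the same way. The sketch below is an illustration in Python and not part of the original text; the names are assumptions, and the cell values are the two-plot totals from Table 62B.

```python
# Totals over both blocks for each level and kind (two plots per cell).
table = {
    "a1": {"a'": 13, "a''": 10, "a'''": 8},
    "a2": {"a'": 15, "a''": 11, "a'''": 18},
}
kinds = ["a'", "a''", "a'''"]
plots_per_cell = 2
n_plots = plots_per_cell * len(table) * len(kinds)          # 12

grand = sum(v for row in table.values() for v in row.values())
correction = grand ** 2 / n_plots                            # (75)^2 / 12

ss_total = (sum(v * v for row in table.values() for v in row.values())
            / plots_per_cell - correction)

kind_totals = [sum(table[l][k] for l in table) for k in kinds]      # 4 plots each
ss_kinds = sum(t * t for t in kind_totals) / 4 - correction

level_totals = [sum(row.values()) for row in table.values()]        # 6 plots each
ss_levels = (level_totals[0] - level_totals[1]) ** 2 / n_plots

ss_interaction = ss_total - ss_kinds - ss_levels

print(ss_total, ss_kinds, round(ss_levels, 3), round(ss_interaction, 3))
```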

This completes the analysis of the sum of squares due to 
treatments, but we found only part of the error term, viz., the 
interaction between blocks and levels of treatments (three, with 
two degrees of freedom). Consequently we next proceed to isolate 
the error or interaction term from blocks and kinds of treatment 
at the first level. (See Table 62C.) 


TABLE 62C 

                          a1'     a1''    a1'''   Total 

        Block I ......     6       4       3        13 
        Block II .....     7       6       5        18 

        Total ........    13      10       8        31 

The sums of squares of deviations for this subgroup are: 

        Total:      6^2 + 4^2 + 3^2 + 7^2 + 6^2 + 5^2 - (31)^2/6 
                    = 171 - 160.17 = 10.83        (5 degrees of freedom) 

        Treatment:  (1/2)[(13)^2 + (10)^2 + (8)^2] - (31)^2/6 
                    = 166.5 - 160.17 = 6.33       (2 degrees of freedom) 

        Blocks:     (13 - 18)^2/6 = 4.17          (1 degree of freedom) 

        Error:      10.83 - (6.33 + 4.17) = 0.33  (2 degrees of freedom) 

Similarly we isolate the error term from blocks and kinds of 
treatment at the second level. 

TABLE 62D 

                          a2'     a2''    a2'''   Total 

        Block I ......     7       5       9        21 
        Block II .....     8       6       9        23 

        Total ........    15      11      18        44 

        Total:      7^2 + 5^2 + 9^2 + 8^2 + 6^2 + 9^2 - (44)^2/6 
                    = 336 - 322.67 = 13.33        (5 degrees of freedom) 

        Treatment:  (1/2)[(15)^2 + (11)^2 + (18)^2] - (44)^2/6 
                    = 335 - 322.67 = 12.33        (2 degrees of freedom) 

        Blocks:     (21 - 23)^2/6 = 0.67          (1 degree of freedom) 

        Error:      13.33 - (12.33 + 0.67) = 0.33 (2 degrees of freedom) 


The components of error between blocks may be summarized 
as follows: 


TABLE 62E 

                                              Sum of squares     Degrees of 
                                              of deviations       freedom 

        Blocks × levels of treatment .....        3                  2 
        Blocks × kinds at 1st level ......        0.33               2 
        Blocks × kinds at 2nd level ......        0.33               2 

        Between blocks ...................        3.67               6 



It now remains only to find the components of error in the dummy 
treatments a0 within blocks. These are 

        Block I     2^2 + 4^2 + 2^2 - 8^2/3 = 2.67     (2 degrees of freedom) 

        Block II    3^2 + 2^2 + 2^2 - 7^2/3 = 0.67     (2 degrees of freedom) 

        Total within blocks              3.33     (4 degrees of freedom) 

Collecting our results we have the following analysis: 

TABLE 63 
Analysis of Variance for Dummy Treatments 

                                      Sum of squares          Degrees of 
                                      of deviations            freedom 

        Blocks ..................          2                      1 

        Treatments 
            Levels ..............         70.33                   2 
            Kinds ...............          6.5       89           2      6 
            Levels × kinds ......         12.17                   2 

        Error 
            Between blocks ......          3.67                   6 
            Within blocks .......          3.33       7           4     10 

        Total ...................         98                     17 
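
The recapitulation can be verified from the eighteen plot yields directly. The following is a minimal Python sketch, not part of the original text; the dictionary layout and names are illustrative assumptions. The treatment sum of squares is computed for the seven distinct treatments (a0 on six plots, each of the others on two), which should equal the sum of the three treatment components in the table.

```python
blocks = {
    "I":  {"a0": [2, 4, 2], "a1'": [6], "a1''": [4], "a1'''": [3],
           "a2'": [7], "a2''": [5], "a2'''": [9]},
    "II": {"a0": [3, 2, 2], "a1'": [7], "a1''": [6], "a1'''": [5],
           "a2'": [8], "a2''": [6], "a2'''": [9]},
}

all_yields = [y for b in blocks.values() for ys in b.values() for y in ys]
n_plots = len(all_yields)                                   # 18
correction = sum(all_yields) ** 2 / n_plots                 # (90)^2 / 18

ss_total = sum(y * y for y in all_yields) - correction

block_sizes = {name: sum(len(ys) for ys in b.values()) for name, b in blocks.items()}
ss_blocks = sum(sum(sum(ys) for ys in b.values()) ** 2 / block_sizes[name]
                for name, b in blocks.items()) - correction

# Seven distinct treatments: pool the yields of each over both blocks.
pooled = {}
for b in blocks.values():
    for name, ys in b.items():
        pooled.setdefault(name, []).extend(ys)
ss_treatments = sum(sum(ys) ** 2 / len(ys) for ys in pooled.values()) - correction

# Error within blocks: variation among the three dummy (a0) plots of each block.
ss_within = sum(sum(y * y for y in b["a0"]) - sum(b["a0"]) ** 2 / len(b["a0"])
                for b in blocks.values())

# Error between blocks, obtained here by subtraction.
ss_between = ss_total - ss_blocks - ss_treatments - ss_within

print(ss_total, ss_blocks, ss_treatments, round(ss_within, 2), round(ss_between, 2))
```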


72. Non-orthogonal data. Non-orthogonality is sometimes 
deliberately introduced into an experimental design, as for exam¬ 
ple when confounding is resorted to. On the other hand, it may 
be unavoidable on account of the nature of the material, as for 
instance in poultry experiments in which the sex of the individual 
birds can not be determined when the experiment is begun, and 
when consequently the numbers of the different sexes in the 
various subclasses are not equal or not even proportional. When 
some of the animals in an experiment die during the progress of 
the experiment, or when some of the plots in a field trial are 
damaged, orthogonality is lost. 

Some of the modifications of the ordinary analysis of variance 
for the case of deliberately non-orthogonal data have been described 
in the earlier sections of this chapter. No attempt will be made 
to discuss other types of non-orthogonal data, but attention is 
called to several original papers dealing with the subject.* 

* S. S. Wilks, "The analysis of variance and covariance in non-orthogonal 
data," Metron, vol. 13, 1938, pp. 141-154. 

F. Yates, "The principles of orthogonality and confounding in replicated 
experiments," Journal of Agricultural Science, vol. 23, 1933, pp. 108-145; 
"The analysis of replicated experiments when the field results are incomplete," 
Empire Journal of Experimental Agriculture, vol. 1, 1933, pp. 129-142; "The 
analysis of multiple classifications with unequal numbers in the different 
classes," Journal of the American Statistical Association, vol. 29, 1934, pp. 
51-66. 












EXERCISES 

1. Complete the analysis of variance for Table 62, p. 166. 

2. Figure 14 shows the yields, in pounds per plot, in an experiment in 
raising potatoes. Six different treatments were used in each of 4 different 
blocks. The random field arrangement is shown in the table, in which ti 
indicates the treatment applied to the particular plot. Analyze the variance, 
and make the appropriate tests. 

        Block I       306    442    [illegible]    290    457    349 
        Block II      253    415    297    288    268    434 
        Block III     419    307    178    310    467    304 
        Block IV      166    428    308    404    172    268 

        [Treatment labels for the individual plots are illegible.] 

Fig. 14. Yields of Potatoes in Pounds per Plot 
(Randomized Blocks) 

3. Figure 15 shows the actual arrangement of a Latin square used in a 
potato-growing experiment in which 4 different treatments, t1, t2, t3, t4, were 
applied. The numbers in the various cells are yields in pounds per plot. 
(Eden and Fisher, Journal of Agricultural Science, vol. 19.) Analyze the 
variance, and make a test of significance. 

        444    422    173    398 
        279    439    423    409 
        436    428    445    212 
        453    237    410    393 

        [Treatment labels for the individual plots are illegible.] 

Fig. 15. Yields of Potatoes in Pounds per Plot 
(Latin Square) 




4. Figure 16 shows the plan, and the yield in quarter pounds, of an 
experiment in growing oats (Yates, Supplement to the Journal of the Royal 
Statistical Society, vol. 2). There were 3 varieties, v1, v2, v3, and 4 
treatments, the latter consisting of the application of 4 different levels of 
nitrogenous fertilizer, n0, n1, n2, n3. Each block consisted of 3 whole plots, 
each of which contained 1 of the 3 varieties. Each whole plot was subdivided 
into 4 subplots, each subplot containing a different level of nitrogen. Such a 
design is called a split-plot arrangement. Analyze the variance according to 
the following scheme, and make appropriate tests of significance: 

        Blocks 
        Varieties 
        Error 
            Subtotal 

        Nitrogen 
        Nitrogen × Varieties 
        Error 
            Total 

Fig. 16. Experiment on Oats. Plan, and Yields in Quarter-Pounds 













5. In Fig. 17 is shown the design of an experiment on peas (Yates, 
Supplement to the Journal of the Royal Statistical Society, vol. 2). Three 
different kinds of fertilizer were used: nitrogen (n), phosphate (p), and 
potash (k). These were administered singly and in various combinations, 
including no fertilizer, which is indicated by the symbol (1). Half of the 
6 blocks contained the treatments (1), np, nk, pk; the other half, the 
treatments n, p, k, npk. The yields of the various plots are given in pounds. 
Perform an analysis of variance, and make any tests of significance that seem 
appropriate. 

        npk      p        npk      n        np       pk 
        55.8     62.8     58.5     59.8     62.8     49.5 

        k        n        p        k        nk       (1) 
        55.0     69.5     56.0     55.5     57.0     46.8 

        np       nk       (1)      np       npk      n 
        59.0     57.2     51.5     52.0     48.8     62.0 

        (1)      pk       pk       nk       p        k 
        56.0     53.2     48.8     49.8     44.2     45.5 

Fig. 17. Experiment on Peas. Plan, and Yields in Pounds 




TABLES 

    I.    Probabilities and Ordinates of the Normal Curve Corresponding 
          to Given Deviations 
    II.   Deviations of the Normal Curve Corresponding to Given Probabilities 
    III.  Probabilities of the Normal Curve Corresponding to Large Deviations 
    IV.   Deviations of the Normal Curve Corresponding to Small Probabilities 
    V.    Values of t Corresponding to Given Probabilities 
    VI.   Values of χ^2 Corresponding to Given Probabilities 
    VII.  5 Per Cent Points of the Distribution of z 
    VIII. 1 Per Cent Points of the Distribution of z 











TABLE I 

Probabilities and Ordinates of the Normal Curve 
Corresponding to Given Deviations 

Columns: x; probability of a deviation greater than x; ordinate 
(2π)^(-1/2) e^(-x^2/2). 

     x     Prob.   Ordinate        x     Prob.   Ordinate 

    0.00   .5000    .3989         0.50   .3085    .3521 
    0.01   .4960    .3989         0.51   .3050    .3503 
    0.02   .4920    .3989         0.52   .3015    .3485 
    0.03   .4880    .3988         0.53   .2981    .3467 
    0.04   .4840    .3986         0.54   .2946    .3448 
    0.05   .4801    .3984         0.55   .2912    .3429 
    0.06   .4761    .3982         0.56   .2877    .3410 
    0.07   .4721    .3980         0.57   .2843    .3391 
    0.08   .4681    .3977         0.58   .2810    .3372 
    0.09   .4641    .3973         0.59   .2776    .3352 
    0.10   .4602    .3970         0.60   .2743    .3332 
    0.11   .4562    .3965         0.61   .2709    .3312 
    0.12   .4522    .3961         0.62   .2676    .3292 
    0.13   .4483    .3956         0.63   .2643    .3271 
    0.14   .4443    .3951         0.64   .2611    .3251 
    0.15   .4404    .3945         0.65   .2578    .3230 
    0.16   .4364    .3939         0.66   .2546    .3209 
    0.17   .4325    .3932         0.67   .2514    .3187 
    0.18   .4286    .3925         0.68   .2483    .3166 
    0.19   .4247    .3918         0.69   .2451    .3144 
    0.20   .4207    .3910         0.70   .2420    .3123 
    0.21   .4168    .3902         0.71   .2389    .3101 
    0.22   .4129    .3894         0.72   .2358    .3079 
    0.23   .4090    .3885         0.73   .2327    .3056 
    0.24   .4052    .3876         0.74   .2296    .3034 
    0.25   .4013    .3867         0.75   .2266    .3011 
    0.26   .3974    .3857         0.76   .2236    .2989 
    0.27   .3936    .3847         0.77   .2206    .2966 
    0.28   .3897    .3836         0.78   .2177    .2943 
    0.29   .3859    .3825         0.79   .2148    .2920 
    0.30   .3821    .3814         0.80   .2119    .2897 
    0.31   .3783    .3802         0.81   .2090    .2874 
    0.32   .3745    .3790         0.82   .2061    .2850 
    0.33   .3707    .3778         0.83   .2033    .2827 
    0.34   .3669    .3765         0.84   .2005    .2803 
    0.35   .3632    .3752         0.85   .1977    .2780 
    0.36   .3594    .3739         0.86   .1949    .2756 
    0.37   .3557    .3725         0.87   .1922    .2732 
    0.38   .3520    .3712         0.88   .1894    .2709 
    0.39   .3483    .3697         0.89   .1867    .2685 
    0.40   .3446    .3683         0.90   .1841    .2661 
    0.41   .3409    .3668         0.91   .1814    .2637 
    0.42   .3372    .3653         0.92   .1788    .2613 
    0.43   .3336    .3637         0.93   .1762    .2589 
    0.44   .3300    .3621         0.94   .1736    .2565 
    0.45   .3264    .3605         0.95   .1711    .2541 
    0.46   .3228    .3589         0.96   .1685    .2516 
    0.47   .3192    .3572         0.97   .1660    .2492 
    0.48   .3156    .3555         0.98   .1635    .2468 
    0.49   .3121    .3538         0.99   .1611    .2444 

The probability of a deviation numerically greater than x is twice the 
probability given in the table. 
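
Entries of this kind can be recomputed with standard library functions. The following is a minimal Python sketch, not part of the original text; the function names `upper_tail` and `ordinate` are illustrative assumptions, and the tail probability is obtained from the complementary error function.

```python
import math

def upper_tail(x):
    """Probability that a standard normal deviate exceeds x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ordinate(x):
    """Height of the standard normal curve at x: (2*pi)^(-1/2) * exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Print a few rows in the layout of Table I.
for x in (0.00, 0.25, 0.50, 0.75, 0.99):
    print(f"{x:4.2f}   {upper_tail(x):.4f}   {ordinate(x):.4f}")

# Probability of a deviation numerically greater than x (both tails).
two_sided = 2.0 * upper_tail(0.50)
```

Doubling the one-sided value, as in the last line, gives the probability of a deviation numerically greater than x.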



TABLES 


195 


TABLE I —Continued 

Probabilities and Ordinates of the Normal Curve 
Corresponding to Given Deviations 



The probability of a deviation numerically greater than x is twice the 


probability given in the table. 



































196 


TABLES 


TABLE I —Continued 


PBOBABILmEB AND OrDINATES OF THE NORMAL CuRVE 
Corresponding to Given Deviations 


z 

Probability 
of a deviation 
greater than z 

Ordinate 

(2r)H 

X 

Probability 
of a deviation 
greater than x 

Ordinate 

g— 

(2x)>^ 

2.00 

.0228 

.0640 

2.60 

.0062 

.0176 

2.01 

.0222 

.0629 

2.61 


.0171 

2.02 

.0217 

.0619 

2.62 


.0167 

2.03 

.0212 

.0508 

2.53 


.0163 

2.04 

.0207 

.0498 

2.64 

.0065 

.0168 

2.06 


.0488 

2.66 

.0064 

.0164 

2.06 


.0478 

2.66 

.0062 

.0161 

2.07 


.0468 

2.67 

.0061 

.0147 

2.08 

.0188 

.0469 

2.68 

.0049 

.0143 

2.09 

.0183 

.0449 

2.59 

.0048 

.0139 

2.10 

.0179 

.0440 


.0047 

• .0136 

2.11 

.0174 

.0431 

2.61 

.0046 

.0132 

2.12 


.0422 

2.62 

.0041 

.0129 

2.13 

.0166 

.0413 

2.63 

.0043 

.0126 

2.14 

.0162 

.0404 

2.64 

.0041 

.0122 

2.15 

.0168 

.0396 

2.65 

.0040 

.0119 

2.16 

.0164 

.0387 

2.66 

.0039 


2.17 

.0160 

.0379 

2.67 

.0038 

.0113 

2.18 

.0146 

.0371 

2.68 

.0037 

.0110 

2.19 

.0143 

.0363 

2.69 

.0036 

.0107 

2.20 

,0139 

.0366 


.0036 

.0104 

2.21 

.0136 

.0347 

2.71 

.0034 

.0101 

2.22 

.0132 

.0339 

2.72 

.0033 

.0099 

2.23 

.0129 

.0332 

2.73 

.0032 

.0096 

2.24 

.0126 

.0326 

2.74 

.0031 

.0093 

2.26 

.0122 

.0317 

2.75 

.0030 

.0091 

2.26 

.0119 

.0310 

2.76 

.0029 

.0088 

2.27 

.0116 

.0303 

2.77 

.0028 

.0086 

2.28 

.0113 

.0297 

2.78 

.0027 

.0084 

2.29 


.0290 

2.79 

.0026 

.0081 

2.30 

.0107 

.0283 

2.80 

.0026 

.0079 

2.31 

.0104 

.0277 

2.81 

.0026 

.0077 

2.32 

.0102 

.0270 

2.82 

.0024 

.0076 

2.33 


.0264 

2.83 

.0023 

.0073 

2.34 


.0258 

2.84 

.0023 

.0071 

2.35 


.0262 

2.86 

.0022 

.0069 

2.36 


.0246 

2.86 

.0021 

.0067 

2.37 


.0241 

2.87 

.0021 

.0065 

2.38 


.0236 

2.88 

.0020 

.0063 

2.39 


.0229 

2.89 . 

.0019 

.0061 

2.40 


.0224 


.0019 

.0060 

2.41 


.0219 

2.91 

.0018 

.0068 

2.42 


.0213 

2.92 

.0018 

.0066 

2.43 


.0208 

2.93 

.0017 

.0066 

2.44 

.0073 


2.94 

.0016 

.0053

2.45 

.0071 

.0198 

2.95


.0051

2.46 

.0069 

.0194 

2.96 


.0050

2.47 


.0189 

2.97 


.0048 

2.48 


.0184 

2.98 


.0047 

2.49 


.0180 

2.99 


.0046 


The probability of a deviation numerically greater than x is twice the 
probability given in the table. 
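
The two tabulated columns can be recomputed directly. The following is a minimal sketch, assuming the table refers to the standard normal curve (zero mean, unit standard deviation); the function names `upper_tail`, `ordinate`, and `two_sided` are illustrative and do not come from the text.

```python
import math

def upper_tail(x: float) -> float:
    """Probability that a standard normal deviate exceeds x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ordinate(x: float) -> float:
    """Height of the standard normal curve at x: exp(-x^2/2) / sqrt(2*pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def two_sided(x: float) -> float:
    """Probability of a deviation numerically greater than x (twice the tail)."""
    return 2.0 * upper_tail(abs(x))

# Print a few rows in the layout of Table I: x, tail probability, ordinate.
for x in (2.00, 2.50, 2.99):
    print(f"{x:.2f}  {upper_tail(x):.4f}  {ordinate(x):.4f}")
```

Rounded to four places, the printed rows should agree with the corresponding table entries, which gives a quick check on any doubtful digit.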


















































TABLE I —Continued 


Probabilities and Ordinates of the Normal Curve
Corresponding to Given Deviations


X 

Probability 
of a deviation 
greater than x 

Ordinate 

$e^{-x^2/2}/(2\pi)^{1/2}$

X 

Probability 
of a deviation 
greater than x 

Ordinate 

$e^{-x^2/2}/(2\pi)^{1/2}$

3.00

.0013 


3.50 

.0002 

.0009 

3.01 

.0013 


3.51

.0002 

.0008 

3.02 

.0013 


3.52 

.0002 

.0008 

3.03 

.0012 


3.53 

.0002 

.0008 

3.04 

.0012 


3.54 

.0002 

.0008 

3.05

.0011 


3.55 

.0002 

.0007 

3.06 

.0011 


3.56 

.0002 

.0007 

3.07 

.0011 


3.57 

.0002 

.0007 

3.08 

.0010 


3.58 

.0002 

.0007 

3.09

.0010 


3.59 

.0002 

.0006 

3.10 

.0010 


3.60 

.0002 

.0006 

3.11 

.0009 


3.61 

.0002 

.0006 

3.12 

.0009 


3.62 

.0001 

.0006 

3.13 

.0009 

.0030 

3.63 


.0006 

3.14 

.0008 

.0029 

3.64 


.0005 

3.15

.0008 

.0028 

3.65 


.0005 

3.16 

.0008 

.0027 

3.66 


.0005 

3.17 

.0008 

.0026 

3.67 


.0005 

3.18 

.0007 

.0025 

3.68 

.0001 

.0005 

3.19

.0007 

.0025 

3.69 

.0001 

.0004 

3.20 

.0007 

.0024 

3.70 

.0001 

.0004 

3.21 

.0007 

.0023 

3.71 

.0001 

.0004 

3.22 

.0006 

.0022 

3.72 

.0001 

.0004 

3.23 

.0006 

.0022 

3.73 

.0001 

.0004 

3.24 

.0006 

.0021 

3.74 

.0001 

.0004 

3.25

.0006 

.0020 

3.75 

.0001 

.0004 

3.26 

.0006 

.0020 

3.76 

.0001 

.0003 

3.27 

.0005 

.0019 

3.77 

.0001 

.0003 

3.28 

.0005

.0018 

3.78 

.0001 

.0003 

3.29

.0005

.0018 

3.79 

.0001 

.0003 

3.30 

.0005 

.0017 

3.80 

.0001 


3.31 

.0005 

.0017 

3.81 

.0001 


3.32 

.0005

.0016 

3.82 



3.33 

.0004 

.0016 

3.83 

.0001 


3.34 

.0004 

.0015 

3.84 

.0001 

.0003 

3.35

.0004

.0015

3.85

.0001 

.0002 

3.36 

.0004 

.0014 

3.86 

.0001 

.0002 

3.37 

.0004 

.0014 

3.87 

.0001 

.0002 

3.38 

.0004 

.0013 

3.88 

.0001 


3.39

.0003 

.0013 

3.89 

.0001 


3.40 

.0003 

.0012

3.90 

.0000 

.0002 

3.41 

.0003 

.0012 

3.91 

.0000 


3.42 

.0003 

.0012 

3.92 

.0000 


3.43 

.0003 

.0011 

3.93 

.0000 


3.44 

.0003 

.0011 

3.94 

.0000 

.0002 

3.45

.0003


3.95

.0000

.0002

3.46

.0003 


3.96 

.0000 

.0002 

3.47 

.0003 


3.97 

.0000 

.0002 

3.48 

.0003 


3.98 

.0000 


3.49

.0002 


3.99 


.0001


The probability of a deviation numerically greater than x is twice the 
probability given in the table. 










































TABLE II 

Deviations of the Normal Curve Corresponding to 
Given Probabilities 


Probability 
of a deviation 
greater than x 

X 

Probability 
of a deviation 
greater than x 

X 

Probability 
of a deviation 
greater than x 

X 

.000 



.9346 

.350 

.3853

.005 

2.5758 


.9154 

.355 

.3719 

.010 

2.3263 


.8965 

.360 

.3585 

.015 



.8779 

.365 

.3451 

.020 



.8596 

.370 

.3319 

.025 



.8416 

.375 

.3186 

.030 


.205 

.8239 

.380 

.3055 

.035 

1.8119 

.210 

.8064 

.385 

.2924 

.040 


.215 

.7892 

.390 

.2793 

.045 

1.6954 

.220 

.7722 

.395 

.2663 

.050 

1.6449 

.225 

.7554 

.400 

.2533

.055 

1.5982 

.230 

.7388 

.405 

.2404 

.060 

1.5548 

.235 

.7225 

.410 

.2275 

.065 

1.5141 

.240 

.7063 

.415 

.2147 


1.4758 

.245 

.6903 

.420 

.2019 


1.4395 

.250 

.6745 

.425 

.1891 



.255 

.6588 

.430 

.1764 


1.3722 

.260 

.6433 

.435 

.1637 



.265 

.6280 

.440 

.1510



.270 

.6128 

.445 

.1383 


1.2816 

.275 

.5978 

.450 

.1257 


1.2536 

.280 

.5828 

.455 

.1130 


1.2265 

.285 

.5681 

.460 

.1004 

.115 


.290 

.5534 

.465 

.0878 


1.1750 

.295 

.5388 

.470 

.0753

.125 

1.1503 

.300 

.5244 

.475

.0627 


1.1264 

.305 

.5101 

.480 

.0502 

.135 

1.1031 

.310 

.4959 

.485 

.0376 


1.0803 

.315 

.4817 

.490 

.0251

.145 

1.0581 

.320 

.4677 

.495 

.0125 

.150 

1.0364 

.325 

.4538 

.500

.0000

.155 

1.0152 

.330 

.4399 



.160 

.9945 

.335 

.4261 



.165 

.9741 

.340 

.4125 



.170 

.9542 

.345 

.3989 




The probability of a deviation numerically greater than x is twice the 
probability given in the table. 
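
Table II is the inverse of Table I: it gives the deviation that is exceeded with a stated one-sided probability. A minimal sketch using Python's standard library follows, assuming a standard normal curve; the function name `deviation_for` is hypothetical.

```python
from statistics import NormalDist

def deviation_for(p: float) -> float:
    """Deviation x such that P(deviate > x) = p, for 0 < p < 1.

    By symmetry the upper-tail point for p is the negative of the
    lower-tail quantile, which avoids forming 1 - p for very small p.
    """
    return -NormalDist().inv_cdf(p)

# A few entries in the layout of Table II: probability, then x.
for p in (0.005, 0.050, 0.250):
    print(f"{p:.3f}  {deviation_for(p):.4f}")
```

For the two-sided reading stated in the note, halve the desired two-sided probability before calling the function; for example, a 5 per cent two-sided level corresponds to `deviation_for(0.025)`.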





















TABLE III 

Probabilities of the Normal Curve Corresponding to
Large Deviations


  x     Probability of a               x     Probability of a
        deviation greater than x             deviation greater than x

  3     .00134 99                      6     9.8660 × 10^{-10}
  3.5   .00023 26                      7     1.2798 × 10^{-12}
  4     .00003 17                      8     6.2210 × 10^{-16}
  4.5   .00000 33977                   9     1.1286 × 10^{-19}
  5     .00000 02867                  10     7.6199 × 10^{-24}


The probability of a deviation numerically greater than x is twice the 
probability given in the table. 
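
For deviations this large, the tail probability is far smaller than the rounding error incurred by subtracting a cumulative probability from 1, so it is safer to evaluate the tail directly. The sketch below assumes a standard normal curve and uses the complementary error function; `upper_tail` is an illustrative name.

```python
import math

def upper_tail(x: float) -> float:
    """P(deviate > x) for a standard normal curve, computed via erfc
    so that very small probabilities keep their leading digits."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Reproduce the layout of Table III in scientific notation.
for x in (3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10):
    print(f"{x:>4}  {upper_tail(x):.4e}")
```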


TABLE IV 

Deviations of the Normal Curve Corresponding to
Small Probabilities


Probability of a                        Probability of a
deviation greater than x      x         deviation greater than x      x

.005                       2.57583      .000,000,5                 4.89164
.000,5                     3.29053      .000,000,05                5.32672
.000,05                    3.89059      .000,000,005               5.73073
.000,005                   4.41717      .000,000,000,5             6.10941


The probability of a deviation numerically greater than x is twice the 
probability given in the table. 






TABLE V 

Values of t Corresponding to Given Probabilities *


Degrees of        Probability of a deviation greater than t
freedom n      .005      .01      .025      .05       .1       .15

     1       63.657   31.821   12.706    6.314    3.078    1.963
     2        9.925             4.303    2.920    1.886    1.386
     3        5.841    4.541    3.182    2.353    1.638    1.250
     4        4.604    3.747    2.776    2.132    1.533
     5        4.032    3.365    2.571    2.015    1.476
     6        3.707    3.143    2.447    1.943    1.440
     7        3.499    2.998    2.365    1.895    1.415    1.119
     8        3.355    2.896    2.306    1.860    1.397    1.108
     9        3.250    2.821    2.262    1.833    1.383    1.100
    10        3.169    2.764    2.228    1.812    1.372    1.093
    11        3.106    2.718    2.201    1.796    1.363    1.088
    12        3.055    2.681    2.179    1.782    1.356    1.083
    13        3.012    2.650    2.160    1.771    1.350    1.079
    14        2.977    2.624    2.145    1.761    1.345    1.076
    15        2.947    2.602    2.131    1.753    1.341    1.074
    16        2.921    2.583    2.120    1.746    1.337    1.071
    17        2.898    2.567    2.110    1.740    1.333    1.069
    18        2.878    2.552    2.101    1.734    1.330    1.067
    19        2.861    2.539    2.093    1.729    1.328    1.066
    20        2.845    2.528    2.086    1.725    1.325    1.064
    21        2.831    2.518    2.080    1.721    1.323    1.063
    22        2.819    2.508    2.074    1.717    1.321    1.061
    23        2.807    2.500    2.069    1.714    1.319    1.060
    24        2.797    2.492    2.064    1.711    1.318    1.059
    25        2.787    2.485    2.060    1.708    1.316    1.058
    26        2.779    2.479    2.056    1.706    1.315    1.058
    27        2.771    2.473    2.052    1.703    1.314    1.057
    28        2.763    2.467    2.048    1.701    1.313    1.056
    29        2.756    2.462    2.045    1.699    1.311    1.055
    30        2.750    2.457    2.042    1.697    1.310    1.055
     ∞        2.576    2.326    1.960    1.645    1.282    1.036


The probability of a deviation numerically greater than t is twice the
probability given at the head of the table.

* This table is reproduced from “Statistical Methods for Research Workers,” with the
generous permission of the author, Professor R. A. Fisher, and the publishers, Messrs.
Oliver and Boyd.
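
An entry of Table V can be checked by integrating the density of “Student’s” distribution numerically. The following is a rough sketch, not the method by which the table was computed: it assumes the usual density of t with n degrees of freedom and applies Simpson's rule; the names `t_density` and `t_upper_tail` and the step count are illustrative choices.

```python
import math

def t_density(t: float, n: float) -> float:
    """Density of Student's t with n degrees of freedom."""
    log_c = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
             - 0.5 * math.log(n * math.pi))
    return math.exp(log_c - (n + 1) / 2 * math.log1p(t * t / n))

def t_upper_tail(t: float, n: float, steps: int = 2000) -> float:
    """P(T > t) for t >= 0: one half minus the area from 0 to t,
    with the area found by Simpson's rule (steps must be even)."""
    if t < 0 or steps % 2:
        raise ValueError("need t >= 0 and an even number of steps")
    h = t / steps
    total = t_density(0.0, n) + t_density(t, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, n)
    return 0.5 - total * h / 3

# The entry for n = 10 under .025 is 2.228; the tail should be close to .025.
print(round(t_upper_tail(2.228, 10), 4))
```

Doubling the result gives the probability of a deviation numerically greater than t, as the note above states.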











TABLE V —Continued 

Values of t Corresponding to Given Probabilities


Degrees Probability of a deviation greater than t 



The probability of a deviation numerically greater than t is twice the 
probability given at the head of the table. 




TABLE VI 


Values of $\chi^2$ Corresponding to Given Probabilities *



For larger values of n, the quantity $\sqrt{2\chi^2} - \sqrt{2n - 1}$ may be used as a
normal deviate with unit standard deviation.

* This table is reproduced from “Statistical Methods for Research Workers,” with the
generous permission of the author, Professor R. A. Fisher, and the publishers, Messrs.
Oliver and Boyd.


























TABLE VI —Continued

Values of $\chi^2$ Corresponding to Given Probabilities


Degrees of      Probability of a deviation greater than $\chi^2$
freedom n       .70       .80       .90       .95       .98       .99

     1         .148     .0642     .0158    .00393   .000628   .000157
     2         .713      .446      .211      .103     .0404     .0201
     3        1.424     1.005      .584      .352      .185      .115
     4        2.195     1.649     1.064      .711      .429      .297
     5        3.000     2.343     1.610     1.145      .752      .554
     6        3.828     3.070     2.204     1.635     1.134      .872
     7        4.671     3.822     2.833     2.167     1.564     1.239
     8        5.527     4.594     3.490     2.733     2.032     1.646
     9        6.393     5.380     4.168     3.325     2.532     2.088
    10        7.267     6.179     4.865     3.940     3.059     2.558
    11        8.148     6.989     5.578     4.575     3.609     3.053
    12        9.034     7.807     6.304     5.226     4.178     3.571
    13        9.926     8.634     7.042     5.892     4.765     4.107
    14       10.821     9.467     7.790     6.571     5.368     4.660
    15       11.721    10.307     8.547     7.261     5.985     5.229
    16       12.624    11.152     9.312     7.962     6.614     5.812
    17       13.531    12.002    10.085     8.672     7.255     6.408
    18       14.440    12.857    10.865     9.390     7.906     7.015
    19       15.352    13.716    11.651    10.117     8.567     7.633
    20       16.266    14.578    12.443    10.851     9.237     8.260
    21       17.182    15.445    13.240    11.591     9.915     8.897
    22       18.101    16.314    14.041    12.338    10.600     9.542
    23       19.021    17.187    14.848    13.091    11.293    10.196
    24       19.943    18.062    15.659    13.848    11.992    10.856
    25       20.867    18.940    16.473    14.611    12.697    11.524
    26       21.792    19.820    17.292    15.379    13.409    12.198
    27       22.719    20.703    18.114    16.151    14.125    12.879
    28       23.647    21.588    18.939    16.928    14.847    13.565
    29       24.577    22.475    19.768    17.708    15.574    14.256
    30       25.508    23.364    20.599    18.493    16.306    14.953


For larger values of n, the quantity $\sqrt{2\chi^2} - \sqrt{2n - 1}$ may be used as
a normal deviate with unit standard deviation.
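
The large-n rule in the note can be tried at the edge of the table to see how closely it agrees. The sketch below applies it to n = 30 and the tabulated value 18.493 (the .95 column); the function names are hypothetical, and the normal tail is taken from the complementary error function.

```python
import math

def chi_square_as_normal_deviate(chi2: float, n: float) -> float:
    """sqrt(2*chi2) - sqrt(2n - 1), treated as a unit normal deviate."""
    return math.sqrt(2.0 * chi2) - math.sqrt(2.0 * n - 1.0)

def approx_upper_tail(chi2: float, n: float) -> float:
    """Approximate probability of a chi-square value greater than chi2."""
    d = chi_square_as_normal_deviate(chi2, n)
    return 0.5 * math.erfc(d / math.sqrt(2.0))

d = chi_square_as_normal_deviate(18.493, 30)
print(f"deviate = {d:.4f}, approximate probability = {approx_upper_tail(18.493, 30):.3f}")
```

At n = 30 the approximate probability falls within about half a percentage point of the tabulated .95, which suggests the rule is adequate just beyond the range of the table.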




TABLE VII 

5 Per Cent Points of the Distribution of z * 


Degrees of       Degrees of freedom n1 of greater mean square
freedom n2
of smaller
mean square      1      2      3      4      5      6


1 2.5421 

2 1.4592 

3 1.1577 

4 1.0212 

5 .9441 

6 .8948 

7 .8606 

8 .8355 

9 .8163 

10 .8012 

11 .7889 

12 .7788 

13 .7703 

14 .7630 

15 .7568 

16 .7514 

17 .7466 

18 .7424 

19 .7386 

20 .7352 

21 .7322 

22 .7294 

23 .7269 

24 .7246 

25 .7225 



2.6479 

1.4722 

1.1284 

.9690 

.8777 

.8188 

.7777 

.7475 

.7242 

.7058 

.6909 

.6786 

.6682 

.6594 

.6518 

.6451 

.6393 

.6341 

.6295 

.6254 

.6216 
.6182 
.6151
.6123
.6097

.6073 

.6051 

.6030 

.6011 

.5994 

.5866 

.5738 

.5611 

.5486 



* This table is reproduced from “Statistical Methods for Research Workers,” with the
generous permission of the author, Professor R. A. Fisher, and the publishers, Messrs.
Oliver and Boyd.
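
Readers who work with the variance ratio rather than z can convert between the two. The sketch below assumes the usual definition of Fisher's z as half the natural logarithm of the ratio of the greater to the smaller mean square, so that the ratio F equals $e^{2z}$; the function names are illustrative.

```python
import math

def z_to_f(z: float) -> float:
    """Variance ratio corresponding to Fisher's z, assuming z = 0.5 * ln(F)."""
    return math.exp(2.0 * z)

def f_to_z(f: float) -> float:
    """Fisher's z corresponding to a variance ratio F > 0."""
    return 0.5 * math.log(f)

# 5 per cent points from the first column (n1 = 1) of Table VII.
for n2, z in ((1, 2.5421), (20, 0.7352)):
    print(f"n1 = 1, n2 = {n2}: z = {z:.4f}, variance ratio = {z_to_f(z):.2f}")
```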



























TABLE VII—Continued 


5 Per Cent Points of the Distribution of z


Degrees of 
freedom 
of smaller 
mean square 

Degrees of freedom ni of greater mean square 

7 

8 

10 

12 

24 

∞

1 

2.7335 


2.7442 

2.7484 

2.7688 

2.7693 

2 

1.4814 

1.4819 

1.4826 


1.4840 

1.4861 

3 

1.0922 

1.0899 

1.0866 

1.0842 

1.0781 

1.0716 

4 


.8993 

.8929 

.8886 

.8767 

.8639 

5

.7921 

.7862 

.7776 

.7714 

.7560 

.7368 

6 

.7184 

.7112 


.6931 

.6729 

.6499 

7 

.6668 

.6576 

.6466 

.6369 

.6134 

.6862 

8 


.6176 


.6945 

.6682 

.6371 

9 

.6969 

.6862 

.6717 

.5613 

.6324 

.4979 


.6714 

.5611 

.6457 

.6346 

.6036 

.4667 

11 

.5514 

.5406 

.6243 

.6126 

.4796 

.4387 

12 

.5347 

.6234 


.4941 

.4592 

.4166 

13 

.6206 

.6089 

.4912 

.4785 

.4419 

.3957 

14 

.6084 

.4964 

.4782 

.4649 

.4269 

.3782 

15

.4979 

.4866 

.4668 

.4632 

.4138 

.3628 

16 

.4887 

.4760

.4668 

.4428 

.4022 

.3490 

17 

.4806 

.4676 

.4480 

.4337 

.3919 

.3366 

18 

.4733 

.4602 

.4402 

.4265 

.3827 

.3263 

19 

.4668 

.4636 

.4331 

.4182 

.3743 

.3161 

20 

.4610 

.4474 

.4268 

.4116 

.3668 

.3067 

21 

.4667 

.4420 

.4211 


.3699 

.2971 

22 

.4509 

.4370

.4158 


.3636 

.2892 

23 

.4466 

.4325 



.3478 

.2818 

24 

.4425 

.4283 


.3904 

.3425 

.2749 

25

.4388 

.4244 


.3862 

.3376 

.2685 

26 

.4364 

.4209 

.3987 

.3823 

.3330 

.2625 

27 

.4322 

.4176 

.3962 

.3786 

.3287 

.2569 

28 

.4292 

.4146 

.3920 

.3762 

.3248 

.2616 

29 

.4266 

.4117 

.3889 


.3211 

.2466 

30

.4239 

.4090 

.3861 

.3691 

.3176 

.2419 

40 

.4063 

.3897 

.3665 

.3476 



60 

.3866 

.3702 

.3447 

.3266 

.2654 

.1644 

120 

.3678 

.3606 

.3236 


.2376 

.1131 

∞

.3490

.3309

.3023 

Wa 




















TABLE VIII 

1 Per Cent Points of the Distribution of z *


Degrees of freedom n2 of smaller mean square

Degrees of freedom n1 of greater mean square

1 

2 

3 

4 

5 

6 

1 

4.1535 

4.2585 

4.2974 

4.3175 

4.3297 

4.3379 

2 

2.2950 

2.2976 

2.2984 

2.2988 

2.2991 

2.2992 

3 

1.7649 

1.7140 

1.6915 

1.6786 

1.6703 

1.6645 

4 

1.5270 

1.4452 

1.4075 

1.3856 

1.3711 


5 

1.3943 

1.2929 

1.2449 

1.2164 

1.1974 

1.1838 

6 

1.3103 

1.1955 


1.1068 

1.0843 

1.0680 

7 

1.2526 

1.1281 

1.0672 

1.0300 

1.0048 

.9864 

8 

1.2106 

1.0787 

1.0135 

.9734 

.9459 

.9259 

9 

1.1786 

1.0411 

.9724 

.9299 


.8791 

10 

1.1535 

1.0114 

.9399 

.8954 

.8646 

.8419 

11

1.1333 

.9874 

.9136 

.8674 

.8354 

.8116 

12 

1.1166 

.9677 

.8919 

.8443 

.8111 

.7864 

13 


.9511 

.8737 

.8248 

.7907 

.7652 

14 


.9370 

.8581 


.7732 

.7471 

15 


.9249 

.8448 

.7939 

.7582 

.7314 

16 

1.0719 

.9144 

.8331 

.7814 


.7177 

17 

1.0641 

.9051 

.8229 


.7335 


18 

1.0572 

.8970 

.8138 


.7232 

.6950 

19 

1.0511 

.8897 

.8057 

.7521 


.6854 


1.0457 

.8831 

.7985 

.7443 


.6768 

21 

1.0408 

.8772 

.7920 

.7372 

.6984 


22 

1.0363 

.8719 

.7860 

.7309 

.6916 


23 


.8670 

.7806 

.7251 

.6855 

.6555 

24 

1.0285 

.8626 

.7757 

.7197 

.6799 

.6496 

25 

1.0251 

.8585 

.7712 

.7148 

.6747 

.6442 

26 

1.0220 

.8548 

.7670 


.6699 

.6392 

27 


.8513 

.7631 


.6655 

.6346 

28 


.8481 

.7595 


.6614 

.6303 

29 


.8451

.7562 

.6987 

.6576 

.6263 



.8423 

.7531

.6954 


.6826 


.9949 

.8223 


.6712 

.6283 

.5966 

60 

.9784 

.8025 


.6472 

.6028 

.6687 

120 

.9622 

.7829 

.6867 

.6234 

.5774 

.5419 

∞

.9462 

.7636 

.6651 

0.5999 

.5522 

.5152 


* This table is reproduced from “Statistical Methods for Research Workers,” with the
generous permission of the author, Professor R. A. Fisher, and the publishers, Messrs.
Oliver and Boyd.

































TABLE VIII —Continued

1 Per Cent Points of the Distribution of z

Degrees of freedom n2 of smaller mean square

Degrees of freedom n1 of greater mean square


10 

12 

24 

4.3544 

4.3585 

4.3689 

2.2996 

2.2997 

2.2999 

1.6522 

1.6489 

1.6404 

1.3387 

1.3327 

1.3170 

1.1539 

1.1457 

1.1239 

1.0318 

1.0218 

.9948 


∞




INDEX 


Absolute criteria in regression theory, 126-128
Analysis of covariance, 150-157 
Analysis of variance, 119 
in confounding, 175 
for correlation ratio, 128-130 

in curvilinear and multiple regression and correlation, 124-126 
diagram, 139 

for dummy treatments, 182-189 
in factorial design, 170 
for Latin square, 166-169 
applied to linear regression, 119-123 
for testing linearity of regression, 130-132 
in partial confounding, 176-182 
for randomized blocks, 162-166 
Arithmetic mean, 11-13, see also Mean 
Array, 57n 
Averages, 11 

appropriateness of different, 16-17 

Bailey, 151, 151n 
Bartlett, 102n 

Beta function, incomplete, 119, 119n 
Binomial distribution, 7, 67-68 
approximated, by Gram-Charlier distribution, 73, 76 
by normal distribution, 69-73 
by Poisson exponential distribution, 73 
testing homogeneity of small samples from, 103-105 
index of dispersion of, 104 
mean of, 70 
moments of, 76 
small samples from, 103-105 
standard deviation of, 70 
Birge, 88n

Blocks, randomized, 162-166
Bôcher, 38n


Camp, 73n, 77n

Charlier check, 19, 23, 52 
Chi square, 100 

and variance, connection between, 101 



Chi-square distribution, 100 
Class interval, 1 
Class limits, 1 

Coefficient, of correlation, see Correlation coefficient
of multiple correlation, see Multiple correlation coefficient
of partial correlation, see Partial correlation coefficient
of partial regression, see Partial regression coefficient
of regression, see Regression coefficient
of total correlation, 61 
of variation, 20 

Combining estimates, of correlation, 105-108 
of variance, 102 
Comparisons, independent, 173 
Complete set, 171n 
Confidence limits, 78-79, 90-91 
for population mean, 78-79, 90-91 
Confounding, 174-176 
partial, 176-182 
Contingency table, 110 
degrees of freedom in, 111-112 
exact treatment of 2 X 2, 114-115 
testing independence of attributes in, 110-112 
with small frequencies, 112-115 
Continuous variable, 2 
Correction, Sheppard’s, 19-20, 22, 118
Yates’s, 113-115 
Correlated variables, 47 
Correlation, 47 
diagram, 53 
index of, 58-59 
inverse, 47 
multiple, 59-60 
partial, 60-62 

and regression, connection between, 48-50 
table, 50-52 

Correlation coefficient, 47-48 
combining estimates of, 105-108 
computation of, 48, 51-52 
and correlation ratio, relation between, 57-58 
estimate of, 105-106 

testing homogeneity of estimates of, 105-108 
intraclass, 134 

multiple, see Multiple correlation coefficient 
partial, see Partial correlation coefficient 
testing significance of, 83-84, 122

when population correlation is zero, 97-98 
transformation of, 83, 97, 106 

Correlation coefficients, testing significance of difference between, 84-85 





Correlation ratio, 52-57
computation of, 54-56

and correlation coefficient, relation between, 57-58
testing significance of, 128-130
Covariance, 29
analysis of, 150-157
Crathorne, 67n

Cumulative frequency diagram, 5 
Cumulative frequency table, 2, 3 
Curve, frequency, 4-6 
Gram-Charlier, 76 
normal, 6 

Pearson type III, 6 
Curvilinear regression, 41-43 
testing significance of, 124-125 


Davenport, 118n 
Decile, 17 

Degrees of freedom, in analysis of covariance, 155-157 
in analysis of variance, 133, 138, 145 
for correlation ratio, 129 
in chi-square distribution, 100 
in confounding, 175 
in contingency table, 111-112 
in dummy treatments, 183-189 
in factorial design, 172, 173 
in Fisher’s distribution, 117 
in goodness of fit test, 108-109 

in testing homogeneity of estimates of correlation, 107 

in testing homogeneity of estimates of variance, 102 

for index of dispersion, 104, 105 

in Latin square, 167, 169 

in linear regression, 120-123 

in testing linearity of regression, 131 

in multiple regression, 126 

in parabolic regression, 124-125 

in partial confounding, 177-182 

in randomized blocks, 164 

in “Student’s” distribution, 89

in testing whether sample is from uncorrelated material, 97 
in testing significance, of correlation coefficient when population correlation is zero, 97

of difference between means, 92
of difference between regression coefficients, 95
of partial correlation coefficient, 98 
of partial regression coefficient, 96 
of regression coefficient, 94 






Deming, 88n, 118
Dependent variable, 27 
Design, experimental, 162 
factorial, 169-174 
Deviation, mean, 20-21 
mean square, 18 
probable, 70n 
root-mean-square, 18 
standard, 18-20 

Deviations, sum of squares of, see Sum of squares of deviations 
Diagram, analysis of variance, 139 
correlation, 53 
cumulative frequency, 5 
frequency, discrete variable, 4 
rectangular frequency, 3 
scatter, 27n 

Difference, between correlation coefficients, testing significance of, 84-85
between means, testing significance of, when population standard deviation is known, 79-81

when population standard deviation is unknown, 91-93 
estimated standard deviation of, 92 
uncorrelated, variance of, 80, 91 

between partial regression coefficients, testing significance of, 96-97 
between proportions, testing significance of, 81-83 
standard deviation of, 81 

between regression coefficients, testing significance of, 94-95 
estimated variance of, 95 
between variables, variance of, 79-80, 91 
between variances, testing significance of, 117-119 
variance of, 79-80, 91 
Discrete variable, 2 
Dispersion, 20 

index of, binomial distribution, 104 
Poisson exponential distribution, 105 
Distribution, binomial, 7, 67-68 
chi-square, 100 
cumulative frequency, 2 
Fisher’s, 117 
frequency, 1 

Gram-Charlier, 73, 76-77

of means in samples from normal population, 77 
normal, 6, 68-75 
Pearson type III, 6, 100
Poisson exponential, 7

of ratio of estimates of variance, 117

of standard deviations, 101
“Student’s,” 89
of variances, 101 





Doolittle method, 35 
Dummy treatments, 182-189 


Eisenhart, 126n 
Ekas, 118n 
Error, 138 
probable, 70n 

Estimate, of correlation, 105-106 
of population proportion, 82 
of standard deviation, 88 
of mean, 88 

of regression coefficient, 94 
of variance, 92, 102 

of difference between regression coefficients, 95 
of partial regression coefficient, 95 
of regression coefficient, 94, 95 
Excess, 23 
Expected value, 68 
Experimental design, 162 
Exponential curve, fitting an, 31-33 
Exponential equation, 31-33 
Exponential function, Poisson, 7 
as approximation to binomial distribution, 73 


Factorial, 6, 89 
Factorial design, 169-174 
Fiducial limits, 78-79, 90-91 
for population mean, 78-79, 90-91 
Fisher, Arne, 77, 77n 

Fisher, R. A., 38, 43, 83n, 89, 96n, 100n, 112n, 117n, 118, 118n, 119, 123,
127n, 134n, 150n, 151, 170n, 184, 184n
Fisher’s distribution, 117
Fisher’s method of fitting polynomial, 43 
Fitting, exponential curve, 31-33 
line of least squares, 30 
multiple linear regression, 35-41 
normal curve, to binomial distribution, 69-73 
to observed data, 73-75 
parabola, 41-43
polynomial, 41-43
Fréchet, 57, 57n

Freedom, degrees of, see Degrees of freedom
Frequency curve, 4-6

Gram-Charlier, 76 
normal, 6 

Pearson type III, 6 





Frequency diagram, cumulative, 5 
discrete variable, 4 
rectangular, 3 
Frequency distribution, 1 
graphic representation of, 3-4 
Frequency table, 1 
cumulative, 2 
Fry, 73n, 76n, 77n 

Function, beta, incomplete, 119, 119n
gamma, 89 

Pearson type III, 6, 100 


Gamma function, 89 
Geometric mean, 15 

Geometric representation of analysis of variance, 139 
Goodness of fit test, 108-110 
Gram-Charlier distribution, 73, 76-77
as approximation to binomial distribution, 76-77 
Graphic representation of frequency distributions, 3-4 


Harmonic mean, 15 
“Highly significant,” 78
Histogram, 3 

Homogeneity test, for correlations, 106-108 
for small samples from binomial and Poisson distributions, 103-105 
for variances, 102-103 


Incomplete beta function, 119, 119n 

Independence of attributes in contingency table, testing, 110-112 
Independent, stochastically, 89 
Independent comparisons, 173 
Independent variable, 27 
Index, of correlation, 58-59 
of dispersion, binomial distribution, 104 
Poisson exponential distribution, 105 
Integral variable, 2 
Interaction, 138, 174 
Interval, class, 1 

Intraclass correlation coefficient, 134
Inverse correlation, 47
Irwin, 112n, 117n, 137n


Kendall, 2n, 70n 
Kurtosis, 23 





Latin square, 166-169 
Least squares, 27-30 
Likelihood, maximum, 88n, 101 
Limits, class, 1 

fiducial or confidence, 78-79, 90-91 
for population mean, 78-79, 90-91 
Line of least squares, 27-30 
Line of regression, 27-30 
Linear regression, 27-30 
analysis of variance applied to, 119-123 
multiple, 34-41 

testing significance of, 125-126 
testing significance of, 121 
Linearity of regression, testing, 130-132 
Logarithmic paper, 34 
semi-, 31-32 

Main effects, 174 
Maximum likelihood, 88n, 101 
Mean, 11 
arithmetic, 11-13 
computation of, 12-13 
of binomial distribution, 70 

distribution of, in samples from normal population, 77 
fiducial or confidence limits for population, 78-79, 90-91 
geometric, 15 
harmonic, 15 

of normal distribution, 70 
of sample proportions, 81 

testing significance of, when population standard deviation is known, 
77-78 

when population standard deviation is unknown, 88-90 
of variance, 102 
weighted, 13 
Mean deviation, 20-21 
Mean square deviation, 18 
Median, 13-15 
Method, Doolittle, 35 
Fisher’s, of fitting polynomial, 43 
of maximum likelihood, 88n, 101 
of solving normal equations, 35-40 
Mid-value of class, 1, 2 
Mills, 35n 
Modal class, 15 
Mode, 15

Moments, 21-24 
of binomial distribution, 76 
computation of, 23-24 





Multiple correlation, 59-60 
Multiple correlation coefficient, 59-60
testing significance of, 125-126 
Multiple regression, 34-41 
testing significance of, 125-126 

Non-orthogonal data, 189 
Normal curve or distribution, 6, 68-75 
fitted to binomial distribution, 6, 69-73 
connection with Gram-Charlier distribution, 76 
fitted to observed data, 73-75 
mean of, 70 

standard deviation of, 70 
Normal equations, 28 
for curvilinear regression, 41 
for multiple regression, 34, 41 
numerical solution of, 35-40 
for parabolic regression, 41 

Optimum value, 88n, 101 
of standard deviation, 88n 
of variance, 101 

Order of partial correlation coefficient, 61 
Orthogonality, 170-173 

Parabola, fitting a, 41-43 
Parabolic regression, 41 
Parameter, 88n 
Partial confounding, 176-182 
Partial correlation, 60-62 
Partial correlation coefficient, 60-62 
order of, 61 

testing significance of, 84, 98 
Partial regression coefficient, 34 
testing significance of, 95-96 
estimated variance of, 95 

Partial regression coefficients, testing significance of difference between, 
96-97 

Partition values, 17 

Pearson, 72n, 100n

Pearson type III curve or distribution, 6, 100
Percentile, 17
Point binomial, 71

Poisson exponential distribution, 7

as approximation to binomial distribution, 73 
testing homogeneity of small samples from, 103-105 
index of dispersion of, 105 





Polynomial, fitting a, 41-43
Fisher’s method, 43 

Population correlation, estimate of, 105-106 

Population mean, fiducial or confidence limits for, 78-79, 90-91 

Population proportion, estimate of, 82 

Population variance, estimate of, 102 

Probability, 67 

Probable deviation, 70n 

Probable error, 70n 

Proportion, estimate of population, 82 
mean of, in samples, 81 
standard deviation of, in samples, 81 
Proportions, testing significance of difference between, 81-83 

Quartile, 17 

Randomized blocks, 162-166 
Range, 1 

Ratio, correlation, see Correlation ratio 
of estimated variances, distribution of, 117 
Rectangular frequency diagram, 3 
Regression, 27 

and correlation, connection between, 48-50
curvilinear, 41-43 
line, 27-30 
fitting, 30 
linear, 27-30 

analysis of variance applied to, 119-123 
plane, 34, 35 

testing linearity of, 130-132 
multiple, 34-41 
parabolic, 41 

theory, absolute criteria in, 126-128 
Regression coefficient, 28 
partial, 34 

testing significance of, 95-96 
estimated variance of, 95 
variance of, 94 
estimated, 94, 95 

Regression coefficients, partial, testing significance of difference between,
96-97

testing significance of difference between, 94-95
Rejection of observations, 86n

Relative price, 24
Replication, 172

Rider, 86n

Rietz, 21n, 61n, 67n, 90n
Root-mean-square deviation, 18 





Samples from binomial and Poisson distributions, 103-105 

Sanders, 150n, 156n 

Scatter diagram, 27n 

Semi-logarithmic paper, 31-32

Sheppard’s correction, 19-20, 22, 118

“Significance,” 78

Significance tests, see Testing significance of
“Significant,” 78
Skewness, 22

Small samples from binomial and Poisson distributions, 103-105 
Snedecor, 118, 118n 
Snedecor’s tables, 118, 119, 126, 167 
Solution of normal equations, numerical, 35-40 
Split-plot arrangement, 191 
Square, Latin, 166-169 
Standard deviation, 18-20 
of binomial distribution, 70 
of difference between means, estimated, 92 
of difference between proportions, 81 
of normal distribution, 70 
optimum value of, 88n 
of a proportion, 81 

of a regression coefficient, estimated, 94 
Statistic, 88n 

Stochastically independent, 89 
“Student’s” distribution, 89 

Subdivision of variance into more than two portions, 137-150 
Sum of squares of deviations, among class means, 133, 134 
within classes, 133, 134 
from multiple regression function, 41 
from regression curve, 42 
from regression line, 29 
from regression plane, 35 

t, 89 

variance of, 89 

Testing goodness of fit, 108-110 

Testing homogeneity, of estimates of correlation, 106-108 
of estimates of variance, 102-103 

of small samples from binomial and Poisson distributions, 103-105

Testing independence of attributes in contingency table, 110-112
Testing linearity of regression, 130-132

Testing whether two groups are samples from same population, 111n
Testing whether sample is from uncorrelated material, 97-98
Testing whether samples came from equally correlated populations, 106-108
Testing significance, of correlation coefficient, 83-84, 122
when population correlation is zero, 97-98
of correlation ratio, 128-130





Testing significance, of difference, between correlation coefficients, 84-85 
between means, when population standard deviation is known, 79-81 
when population standard deviation is unknown, 91-93 
between partial regression coefficients, 96-97 
between proportions, 81-83 
between regression coefficients, 94-95
of intraclass correlation coefficient, 134 
of linear regression, 121 

of mean, when population standard deviation is known, 77-78 
when population standard deviation is unknown, 88-90 
of partial correlation coefficient, 84, 98
when population correlation is zero, 98 
of partial regression coefficient, 95-96 
of ratio between variances, 117-119 
of regression coefficient, 93-94 
of variance, 101 
Total correlation coefficient, 61 
Transformation, of chi-square distribution, 101 
of correlation coefficient, 83, 97, 106 
of exponential curves, etc., 31-34 
of ratio of variances, 117 
of variables, 6, 69 
Trend line, 27 

Type A distribution, Gram-Charlier, 73, 76-77 
Type III distribution, Pearson, 6, 100 

Uncorrelated material, testing whether sample is from, 97-98 
Unequal numbers in classes, 136, 150 

Variable, continuous, 2 
dependent, 27 
discrete, 2 
independent, 27 
integral, 2 
Variance, 18-20 

analysis of, see Analysis of variance 
and chi square, connection between, 101 
within and among classes, 132-137 
computation of, 18-19 

of difference, between uncorrelated means, 80, 91
between regression coefficients, estimated, 95
between variables, 79-80, 91

distribution of, 101

distribution of ratios of estimates of, 117

testing homogeneity of estimates of, 102-103

testing significance of ratio between estimates of, 117-119 

of mean, 77 

mean of, 102 





Variance, optimum value of, 101 
of partial regression coefficient, estimate of, 95 
population, estimate of, 102
of regression coefficient, 94 
estimated, 94 

subdivision into more than two portions, 137-150 
testing significance of, 101 
of t, 89 
Variate, 69n

Variation, coefficient of, 20
w, 117

Weighted mean, 13 
Wilks, 189n 
Wishart, 150n, 156n 

Yates, 107n, 112n, 150n, 170n, 184, 184n, 189n
Yates’s correction, 113-115 
Yule, 2n, 70n 

z, 117




